SUMMARY


The CD4 molecules on the target macrophage and T cell are the primary receptors for the
HIV-1 surface glycoprotein, gp120. In addition, chemokine receptors on the macrophage and T cell
serve as co-receptors in the virus-cell interactions. An understanding of the mechanism of virus-cell
interactions requires quantitative analyses of the structure-function correlations of the surface epitopes on
gp120, which contains several constant (C) and variable (V) subdomains linked as C1-V1-V2-C2-V3-C3-
V4-C4-V5-C5. The surface epitope inside the C4 loop is critical for CD4 binding. The epitopes inside
the V1-V2 and V3 loops elicit an HIV-1 neutralizing response and also determine tropism, fusion, and
infectivity of the virus. In the absence of a high-resolution structure of the entire gp120, we have adopted an
alternative approach to analyzing the structural properties of these surface epitopes. For this purpose, we
have combined theoretical and experimental techniques including sequence analysis, molecular modeling,
polypeptide engineering, NMR spectroscopy, antibody binding, and neutralization assays. First, we
have analyzed the sequence-structure-antigenicity correlations of the third variable (V3) loop of gp120
both as a cyclic 35 amino acid long peptide and in the context of the native gp120. Second, we have
obtained average low-energy structures of various other subdomains of gp120 including the V1-V2 and
V4-C4 loops. Finally, we have constructed a working model of gp120 based upon the knowledge of
(i) the structures of the gp120 subdomains obtained by molecular modeling in conjunction with NMR and
other spectroscopic data, (ii) the surface exposure data of various contiguous regions in gp120, and (iii)
the data on subdomain-subdomain interactions obtained from replication competency, monoclonal
antibody binding, fusion, and infectivity assays. The working model of gp120 adequately describes the
sequence-structure-binding properties of the functional epitopes on gp120 such as those inside V1-V2,
V3, and C4.

(Key Words: gp120 subdomains / NMR and modeling / a working model of gp120 / interacting
functional epitopes)
A. RESEARCH OBJECTIVES


Interaction between the surface glycoprotein, gp120, and the CD4 receptor on the target T
cell and macrophage defines the first step in HIV-1 pathogenesis [1-12]. gp120 is made up of several
well-defined disulfide-bridged constant (C) and variable (V) subdomains or loops [4] linked as C1-V1-
V2-C2-V3-C3-V4-C4-V5-C5 (see Figure 1). Receptor and antibody binding experiments [13-19] reveal
that a discontinuous epitope formed by residues in C2, C3, and C4 defines the contact interface for CD4
binding; the region inside C4 is most critical for binding. Although it is not directly involved in CD4
binding, the cyclic 35 amino acid (aa) long V3 loop of gp120 contains neutralizing epitopes, i.e.,
monoclonal antibodies (mAbs) directed against sequences inside the V3 loop can neutralize HIV-1 [21-
32]. Therefore, it appears that the V3 loop is involved in the post-CD4 binding phase of viral
pathogenesis. In fact, it has been reported that the V3 loop is involved in HIV-1 tropism, cell-virus
fusion, and replication competency of the virus [52-56, 60-65]. However, the effective use of the V3
loop as a neutralizing target has been complicated by its extensive sequence variation across HIV-1
isolates.

(Research Objective 1) When this project was submitted, our primary research objective was to
predict theoretically, and to test by two-dimensional nuclear magnetic resonance (2D NMR) spectroscopy,
how the sequence variation correlates with the global structure of the V3 loop and the local structure of
the neutralizing determinant (ND) inside the V3 loop.

Since the submission of this proposal in 1991, the following major advances have been
made in defining the surface epitopes of gp120 involved in HIV-1 pathogenesis.

(i)  In  addition  to  the  V3  loop,  the  V1-V2  loop  has  also  been  shown  to  contain  neutralizing  epitopes  [33- 
39]  and  it  is  also  implicated  in  cell  tropism,  viral  fusion,  and  replication  competency  [57-59]. 

(ii) In addition to the CD4 molecule (the primary receptor), the CC or CXC chemokine receptor (CCR or
CXCR) on the target T cell and macrophage acts as a co-receptor during viral fusion [5-12]. There is
uncertainty about the exact site on gp120 that is responsible for CCR (or CXCR) binding. Studies with
different gp120 chimeras indicate that the V3 loop (and not the V1-V2 loop) is important for binding of
gp120 to CCR on macrophages [10]. However, the 35 aa long V3 loop peptide in linear or cyclic
form shows no binding to CCR or CXCR [20]; this implies that the V3 loop must be presented in the
context of the native gp120 such that its local structure and interactions with other loops are preserved.

(iii) Indeed, for quite some time it has been recognized that different loops of gp120 that are distant in
sequence can functionally (or spatially) interact with each other in the native protein [40-50]. Of
particular importance are the interactions involving V3 and C4. These interactions may, in fact, be critical
in determining HIV tropism, mAb binding, and viral fusion.

These research advances prompted us to expand our research objectives.
Figure 1. Constant (C) and variable (V) loops in gp120. Surface exposures of various contiguous aa
segments in gp120 are also included [data taken from ref. 69].
In the absence of a high-resolution structure of gp120 by X-ray crystallography, the currently
available data in the literature fall short of explaining the structural basis for epitope recognition either by
receptors or by mAbs. Therefore, we have attempted to develop and apply an alternative method for
obtaining a working model of gp120 that accurately defines the structural properties of various functional
epitopes. Characterization of these epitopes in terms of their sequence-structure-binding correlations
will help us better understand the pathogenesis of HIV-1.

(Research Objective 2). To determine the structure-antigenicity correlations of the V3 loop both as a
cyclic peptide and in the context of the native gp120. For this purpose, we have combined molecular
modeling and 2D NMR spectroscopy with mAb binding and neutralization studies.

(Research Objective 3). To obtain a working model of gp120 that accurately defines the local
structures of the individual loops and inter-loop interactions. For this purpose, we have first determined
the models of V3, V1-V2, and V4-C4 subdomains of gp120. We have then used a simulated annealing
method to assemble these subdomains by mainly utilizing the flexibility of the linker regions. The final
set of working models has been obtained by screening the sampled structures against (i) the surface
exposure data of different contiguous regions on gp120 from the immunochemical maps [21-39, 69] and
(ii) the inter-domain interaction data from replication competency, fusion, and infectivity assays [40-50].
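
The screening step in Research Objective 3 can be pictured as a simple filter over the sampled assemblies. The sketch below is only an illustration under stated assumptions, not the procedure actually used: the container `CandidateModel`, the function `passes_screen`, the segment labels, the example numbers, and the 10 angstrom contact cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CandidateModel:
    """One sampled gp120 assembly (hypothetical container for this sketch)."""
    name: str
    exposure: dict       # segment label -> "exposed", "partial", or "buried"
    pair_distance: dict  # (loop_a, loop_b) -> closest inter-loop distance, angstroms


def passes_screen(model, required_exposure, required_contacts, cutoff=10.0):
    """Return True if the model agrees with both kinds of experimental data."""
    # (i) every segment with a known exposure class must match the model
    for segment, expected in required_exposure.items():
        if model.exposure.get(segment) != expected:
            return False
    # (ii) every functionally interacting loop pair must be close in space;
    #      the cutoff is an assumed value, not one taken from the report
    for pair in required_contacts:
        if model.pair_distance.get(pair, float("inf")) > cutoff:
            return False
    return True


# Made-up inputs, chosen only to exercise the filter
required_exposure = {"V3 crest": "exposed", "V3 N-term": "buried"}
required_contacts = [("V3", "C4"), ("V1", "C4")]

models = [
    CandidateModel("model_a", {"V3 crest": "exposed", "V3 N-term": "buried"},
                   {("V3", "C4"): 6.5, ("V1", "C4"): 8.0}),
    CandidateModel("model_b", {"V3 crest": "exposed", "V3 N-term": "exposed"},
                   {("V3", "C4"): 6.0, ("V1", "C4"): 7.0}),
    CandidateModel("model_c", {"V3 crest": "exposed", "V3 N-term": "buried"},
                   {("V3", "C4"): 25.0, ("V1", "C4"): 8.0}),
]

accepted = [m.name for m in models
            if passes_screen(m, required_exposure, required_contacts)]
print(accepted)
```

Here `model_b` fails the exposure test and `model_c` fails the contact test, so only `model_a` survives. Treating the two data sets as hard filters is a simplification; a scoring function that tolerates a few violations would be an equally reasonable design.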

Successful completion of these research objectives enables us to obtain a comprehensive
knowledge of the sequence-structure-binding correlations of various surface epitopes of gp120 that are
involved either in viral pathogenesis or in eliciting a neutralizing immune response. This knowledge can be
utilized in designing antigens for directing immunity against HIV-1. For example, we have been able to
design a multivalent HIV-1 antigen in which the conserved structural element of the V3 loop is multiply
presented.


B.  BACKGROUND  AND  RATIONALE 

In the literature, three types of information are available about the disulfide-bridged
subdomains (or loops) of gp120: (i) functional roles of these loops in viral replication, fusion, infectivity,
and cell tropism [52-65], (ii) surface exposures of these loops [21-39, 69], and (iii) inter-loop
interactions [40-50]. Very recently it has been shown that gp120 also contains epitopes for co-receptors
present on CD4+ T cells and macrophages [5-12].

(i)  Functional  Roles 

The V3 loop of gp120 has been extensively studied. Sequence determinants for HIV-1
tropism have been located inside the V3 loop [52-56]. It has also been reported that the net positive
Figure 2. Summary of HIV-1 gp120 V2 epitopes. Description of linear and conformational epitopes [36]
in the V1-V2 loop with three disulfide bridges. The linear epitope for mAb BAT085 includes the predicted
helical segment inside V2. The conformational epitope (shaded) for CRA-3 consists of non-contiguous
aa segments. Secondary structural elements are as predicted by our modeling method (see later in
Figure 11A): helix = cylinder, beta strand = arrow.
Figure 3. CD4-blocking mAbs raised against MN-gp120 [15] are directed against the C4 helix (shaded)
of V4-C4. Secondary structural elements are as predicted by our modeling method (see later in Figure
11B): helix = cylinder, beta strand = arrow.
charge on the V3 loop is a possible indicator for the syncytium inducing (SI) ability of an HIV-1 strain, i.e.,
SI strains have high net positive charge on their V3 loops whereas the non-SI strains have low net
positive charges on their V3 loops [60-65]. However, it has also been argued that HIV-1 tropism and SI
ability may not only be due to the V3 loop (and its charge) but also be due to its interactions with other
loops (C4 in particular) of gp120. In addition to the V3 loop, the V1-V2 loop has also been implicated in
HIV-1 tropism. Interestingly, the level of glycosylation, the net charge, and the length of the V2 loop seem
to contribute to the observed tropism [57-59]. Table 1 shows sequences of various V1-V2 loops to
document variations in glycosylation, the net charge, and the length of the V2 loop. Table 2 shows
sequences of various V4-C4 loops; note that although V4 shows extensive sequence variation, C4 is
fairly conserved, especially in the putative helical segment.
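
As a rough illustration of the net-charge indicator discussed above, the formal charge of a loop can be counted directly from its sequence. This is a minimal sketch under assumptions: Lys and Arg are counted as +1, Asp and Glu as -1, and His is treated as neutral; the function name `net_charge` is hypothetical. The sequence is the HIV-MN V3 loop as listed in Figure 4.

```python
def net_charge(sequence):
    """Formal net charge of a peptide from its one-letter sequence.

    Assumption for this sketch: Lys/Arg = +1, Asp/Glu = -1, His = 0,
    and terminal groups are ignored.
    """
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return positive - negative


# HIV-MN V3 loop (35 residues), as given in Figure 4
V3_MN = "CTRPNYNKRKRIHIGPGRAFYTTKNIIGTIRQAHC"

print(len(V3_MN), net_charge(V3_MN))
```

With this counting scheme the MN loop carries three Lys and five Arg and no acidic residues. No SI or non-SI threshold is implied by the sketch; the cited studies [60-65] should be consulted for how the charge was actually scored.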

Also, both V3 and V1-V2 loops have been shown to play important roles in viral
replication, virus-cell fusion, and viral infectivity [57-65]. Single site mutations at the highly conserved
GPGR-crest drastically reduce the virus-cell fusion [62]. Also, elimination of the disulfide bridge
between the conserved 1st and 35th Cs in the 35 aa long V3 loop completely disables the cleavage of
gp160 into gp120 and gp41, which is a prerequisite for virus-cell fusion. Similarly, single and double
site mutations inside the V1-V2 loop alter the SI ability of the mutated HIV-1 strains [59]. It has also
been proposed that a putative proteolytic site inside the V3 loop may determine viral fusion to the T cell,
which bears a membrane-bound protease [66-68].

(ii)  Surface  Exposures 

Antibody binding data are useful for determining the surface exposure of various epitopes
on gp120. Specificity of a given epitope on gp120 for a given mAb implies that the epitope in question is
either permanently or transiently exposed. Mainly two classes of murine and human anti-gp120 mAbs
are reported in the literature: (a) mAbs that recognize epitopes on the variable V3 and V1-V2 loops and (b)
mAbs that recognize epitopes on the constant C2, C3, and C4 loops. The mAbs specific for the variable
regions of gp120 neutralize HIV-1 in a type-specific manner, whereas the mAbs specific for the constant
regions of gp120 show broadly cross-reactive neutralizing activity [16]. However, the latter are quite
difficult to raise in animals and mice, probably due to epitope masking.

The majority of the V3-specific mAbs bind to the crest of the loop that contains the GPGR
β turn [21-32]. For some of these mAbs, the epitopes include the N-terminal sequence flanking the
GPGR, whereas for some other mAbs the epitopes include the C-terminal sequence flanking the GPGR;
for a few mAbs (e.g., mAb 9284), the epitopes include both the N- and C-terminal sequences flanking the
GPGR. A new type of V3-specific mAb has also been identified; this mAb recognizes the sequence N-
terminal to the 1st C involved in the disulfide bridge only when gp120 is denatured or in the presence of a
mAb that is specific for the CD4-binding region of gp120 [50]. This means that the N-terminal sequence
in the V3 loop is masked in the native gp120 and is unmasked in the presence of denaturing agents or
mAbs specific for the CD4-binding region.
Both murine and human V1-V2 specific mAbs have been isolated [33-39]. These mAbs
are specific either for linear or conformational epitopes inside the V1-V2 loop (see Figure 2). Single and
double site mutations on gp120 that diminish or enhance the mAb binding have also been reported.
These mutations generally lie inside the V1-V2 loop for mAbs specific for linear epitopes whereas they lie
both inside and outside the V1-V2 loop for mAbs specific for conformational epitopes [36].

Binding studies reveal that a discontinuous epitope formed by C2, C3, and C4 regions on
gp120 determines the specificity for CD4 binding. CD4-blocking mAbs compete for the sites on C2, C3,
and C4 (in particular) for binding (see Figure 3). CD4-blocking mAbs are generally cross-reactive across
HIV-1 isolates [15-19].

Moore and co-workers have identified surface exposures of various sites on gp120 based
upon detailed analyses of mAb and CD4 binding data [69]. These analyses allow classification of
different regions of gp120 as well-exposed, partially exposed, and completely buried, although the
surface exposures of several other regions of gp120 remain undetermined (see Figure 1). The surface
exposure of an aa segment in a gp120 model can be determined by computing the accessible surface areas
of the residues (X) in the segment relative to those in the extended GXG tripeptide. In a well-exposed
segment, accessible surface areas of the residues (X) should be larger than those in GXG. Similarly, in a
buried segment, accessible surface areas of the residues (X) should be lower than those in GXG.
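
The comparison against the extended GXG tripeptide amounts to a per-residue ratio. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the accessible surface areas are taken as already computed by some external program, and the reference and model values are placeholders, not measured data.

```python
def relative_exposure(model_asa, gxg_asa):
    """Ratio of each residue's ASA in the model to its ASA in extended G-X-G.

    model_asa: list of (one-letter residue, ASA in square angstroms)
    gxg_asa:   dict mapping residue type -> reference ASA in G-X-G
    """
    return [(res, asa / gxg_asa[res]) for res, asa in model_asa]


def mean_relative_exposure(model_asa, gxg_asa):
    """Average the per-residue ratios over one contiguous segment."""
    ratios = [ratio for _, ratio in relative_exposure(model_asa, gxg_asa)]
    return sum(ratios) / len(ratios)


# Placeholder numbers only; real values must come from a surface-area
# calculation on the model and on the G-X-G reference peptides.
gxg_reference = {"G": 80.0, "P": 140.0, "R": 240.0}
segment = [("G", 60.0), ("P", 105.0), ("G", 40.0), ("R", 180.0)]

print(mean_relative_exposure(segment, gxg_reference))
```

A segment-level average is only one way to summarize the ratios; the cutoffs separating well-exposed, partially exposed, and buried segments are not stated in the text and are therefore left out of the sketch.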

(iii) Inter-loop Interactions

The binding data not only provide information about local structures and surface
exposures of various loops but also information about long-range loop-loop interactions. In a continuous
conformational epitope, the structure of the epitope is stabilized not only by its amino acid sequence and
local disulfide bridge but also by its interactions with other regions distant in sequence; the
conformational epitope in V2 is one such example [36]. Long-range loop-loop interactions are also
relevant for the discontinuous epitope in which regions distant in sequence come close in space to create
the contact interface for mAb binding; the discontinuous epitope for the CD4-blocking mAb, 1125H, is
one such example [13]. In addition to the mAb binding data, replication competency, fusion, and
infectivity assays reveal functional (and perhaps spatial) interactions between pairs of loops in gp120 [40-
51], e.g., (V3 and V1-V2), (V3 and C4), (V1 and C4), etc. It is important to examine whether the
inter-loop interactions from the functional assays indeed correspond to spatial interactions in our gp120 model.
For example, the N-terminal V3 sequence appears to be functionally correlated with the putative helical
segment in C4. We have, therefore, examined the spatial proximity of these two fragments in our gp120
model. Similarly, since W427 in C4 functionally interacts with residues in V1, we have examined the
spatial proximity of W427 with other residues and in particular the residues inside V1. In fact, we have
tested spatial interactions for residue pairs involved in long-range functional interactions.

In this project, special attention is paid to deciphering the role of the variable loops of
gp120 in surface recognition. We believe that the interplay between the constant and variable regions of
gp120 is critical in determining tropism, fusion, and infectivity although a truncated gp120 with only the
constant regions (i.e., without V1-V2 and V3) can produce high affinity binding to CD4 [73].
Previously, linear and cyclic peptides with V3 epitopes were used as immunogens [74]. Recently, a
bispecific linear peptide containing the V3 and C4 epitopes [75] and a peptamer containing several putative
C4 helical segments [76] have also been tried as immunogens. While the potential use of these
constructs as HIV vaccines remains to be seen, the subdomains chosen in our study will have equal (if
not better) promise as immunogens since they closely mimic the native gp120 (the natural immunogen)
due to the presence of disulfide bridges and inter-loop interactions.

(iv) Surface Epitopes on gp120 for CCR or CXCR (or co-receptor) Binding

Recent studies have shown that the binding of gp120 to CD4 on T cells or macrophages is
not enough for virus-cell fusion. Additional factors (or co-receptors) are needed to complete the gp120-
mediated membrane fusion that is a prerequisite for HIV-1 infection. On macrophages these receptors are
CCR5 or CCR3 [7-8, 10] and on T cells they are fusin or CXCR [9]. Both CCR and fusin are G-
protein coupled receptors with seven transmembrane helical segments [7-12]. Use of various
gp120 chimeras suggests that the V3 loop (and not the V1-V2 loop) is probably involved in the binding of
gp120 to CCR. Similarly, the V3 loop may also be involved in the binding of gp120 to fusin since the
V3-specific mAb, D47, blocks HIV-1 fusion to T cells bearing CD4 and fusin. However, attempts in
various laboratories have failed to demonstrate that the 35 aa long V3 loop peptide itself binds to CCR or
fusin (or CXCR). Since its sequence differs characteristically between macrophage-tropic and T cell-tropic
viruses, the V3 loop appears to be the logical target for CCR and fusin, i.e., V3-specific recognition of gp120 by
CCR and fusin is expected to guide the HIV-1 fusion to macrophage and T cell, respectively. Various
studies [56] indicate that the structure of the V3 loop and its interactions with other loops (rather than the
V3 loop alone) determine HIV-1 tropism. In a similar manner, the structure of the V3 loop and its
interactions with other loops (rather than the V3 loop alone) may also determine the binding of gp120 to
CCR or fusin.


C.  PROGRESS  REPORT 

We have studied (in collaboration with the NCI/NIH and CDC) the structure-antigenicity
correlations of the V3 loops as isolated cyclic peptides and also in the context of the native gp120. For
this work, we have combined various theoretical and experimental tools including sequence analyses,
molecular modeling, NMR spectroscopy, mAb binding by ELISA and BIAcore, syncytium, and
neutralization assays [77-87]. The work on the V3 loop has resulted in the identification of a conserved
structural element at the crest of the V3 loop, and subsequently we have designed multivalent V3-specific
HIV-1 antigens in which this conserved structural element has been multiply expressed. However, we
have not restricted this project to the V3 loop but have also included studies on the V1-V2 and V4-C4
loops of gp120. We have also utilized the circular dichroism (CD) data that show the presence of a
putative α-helix in the V1-V2 and V4-C4 loops and explained how these data are utilized in obtaining
molecular models of V1-V2 and V4-C4. Finally, we have developed and applied a molecular modeling
method for assembling different subdomains of gp120 in a native fold.

C.1 Cyclic V3 Loops: Sequence-Structure-Antigenicity Correlations

C.1a Structural Requirement for Antigen-Antibody Interactions: A Case Study of the
HIV-MN V3 Antigen

Theoretical studies [77-78] revealed that the variability in sequence and structure of the V3
loop is confined to the N- and C-terminal sides of the conserved GPG-crest. This leaves three regions of
the V3 loop conserved both in sequence and secondary structure. Figure 4 shows the V3 loop sequences
of various HIV-1 isolates; the three conserved secondary structural elements are underlined.

Figure  4.  Cyclic  V3  and  Mini-V3  Loops 

burns  burn  helix 

V3-MN  : CTRPN YNKRKRIHIGPGRAF YTTKNI IGT IRQAHC 

Mini  V3  :  CRIHIGPGRAFYTTKC 

V3-RF  : - N-T — S-TK - VI-A-GQ - D— K - 

V3-Florida  : - YT — G-R - V-AAEK - D — R - 

Mini  V3  :  CG-R - V-AAEC 

V3-Haiti  : - D-T — S-PM - K A-GD - N - 

V3-Thailand : - SN-T-TS-T - QV — R-GD - D — K-Y- 

Mini  V3  :  CS-T - QV — R-GC 

We have carried out NMR studies [79-80, 83] to test the validity of our theoretical
predictions. Structural studies were performed for the HIV-MN V3 loop in the linear and cyclic (S-S
bridged) forms. While the linear V3 loop in water is devoid of ordered structure except for a loose turn at
the GPG-crest, the cyclic form shows a well-defined structure in water. Moreover, in (7:3) water:TFE
mixed solvent (less polar than water), the cyclic V3 loop shows higher order both in terms of secondary
structure content and rigidity. The three conserved regions of the HIV-MN V3 loop in the mixed solvent
adopt the predicted secondary structural elements. TFE-induced helix formation in the C-terminal
segment has also been documented by us in the Haitian V3 loop. The TFE-induced, less polar environment for helix
stabilization is biologically relevant in the context of the native structure of gp120. As discussed in the
following section, our modeling studies show that the proximity of the C3 region provides a hydrophobic
environment for the C-terminal segment of the V3 loop (see Figure 10).

The observation of the three conserved secondary structures in the V3 loop leads to a
simple rule that the sequence variability of the V3 loop can be tracked by finding the associated variability
brought about by different structural elements on either side of the GPG-crest. Finally, the
conformational requirement of the neutralizing determinant (ND) in the V3 loop-antibody interaction is
tested by monitoring the mAb binding to the HIV-MN V3 loop in the linear and cyclic forms by ELISA
[79]. The cyclization through the (S-S)-bridge between C1 and C35 and changing the solvent
environment provide interesting insights into the structure-binding correlation of the HIV V3 loop.
Binding of linear and cyclic V3-MN loops to three different monoclonal antibodies is compared in
Figure 5. Antibodies 1510, 1511, and 1289 bind to the V3 epitopes KRIHI, HIGPGR, and GPGRAF,
respectively. Note that the cyclic V3-MN loop is a better ligand than the linear analog in all three cases.
This is consistent with the experimental evidence that the cyclic V3-MN loop is more structured than the
linear analog. As expected, the most pronounced difference in binding occurs for the mAb 1510, which
recognizes the sequence KRIHI on the N-terminal side of the GPG-crest; this sequence also shows more
ordered structure upon cyclization. For the other two antibodies, the difference in binding is smaller
because both of them include the GPGR, which even in the linear analog shows a residual turn.
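
The positions of the three epitopes relative to the GPG-crest can be read off the MN sequence with a short script. This is a small sketch for orientation only; the helper `locate` and the overlap test are illustrative names and choices, and the sequence is the HIV-MN V3 loop from Figure 4.

```python
# HIV-MN V3 loop from Figure 4 and the epitopes quoted in the text
V3_MN = "CTRPNYNKRKRIHIGPGRAFYTTKNIIGTIRQAHC"
EPITOPES = {"1510": "KRIHI", "1511": "HIGPGR", "1289": "GPGRAF"}


def locate(sequence, motif):
    """Return the 1-based (start, end) of the first occurrence of motif, or None."""
    index = sequence.find(motif)
    if index < 0:
        return None
    return index + 1, index + len(motif)


crest = locate(V3_MN, "GPG")  # residue span of the GPG-crest

for mab, epitope in EPITOPES.items():
    start, end = locate(V3_MN, epitope)
    # an epitope overlaps the crest if the two residue ranges intersect
    overlaps_crest = start <= crest[1] and end >= crest[0]
    print(mab, epitope, start, end, overlaps_crest)
```

The KRIHI epitope of mAb 1510 ends immediately before the crest, whereas the epitopes of 1511 and 1289 both span it, which matches the pattern of binding differences described above.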


Figure 5. An ELISA showing the preference of monoclonal antibodies for the cyclic over the
linear form of the HIV-MN V3 loop. Human mAbs 1510 (top) and 1511 (center) and murine antibody
1289 (bottom) all bind to a greater extent to the cyclic V3 loop peptide. The recognized epitopes for
1510 (top), 1511 (center), and 1289 (bottom) are shown on the right. In the schematic representations
of the HIV-MN V3 loop shown on the right, solid circles depict hydrophobic residues, open circles
charged residues, and outlined circles polar uncharged residues. BIAcore measurements [87]
reconfirm the ELISA data.
Therefore, the NMR and antibody binding studies imply that vaccine attempts using the
cyclic V3 loop would be more effective than the linear analog in inducing protective humoral immunity to
the conserved structural features. The binding profiles of mAbs 1510 and 1511 (both derived from AIDS
infected patients) reinforce the notion that the cyclic V3 loop presents epitope structures similar to those
found in native gp120. Interestingly, the (K10-R11-I12-H13-I14-G15-P16-G17) fragment, which is a
part of the neutralizing epitope of the cyclic MN V3 loop, shows the same structure in water and in the
mixed solvent as in the co-crystal of the neutralizing antibody (mAb 50.1) and the MN V3 loop peptide
antigen complex [88].

C.1b Conserved Structure at the Immunogenic Tip of the V3 Loop: Design of a
Chimeric Multivalent HIV Antigen that Contains Multiple Copies of this Conserved
Structural Element

We carried out molecular modeling and two-dimensional (2D) NMR studies on the V3
loop sequences shown in Figure 4 to identify the structural features of the V3 loop, especially at the ND,
that remain conserved irrespective of the sequence variation. The conserved structure of the ND is a
solvent accessible protruding motif or a knob (Figure 6). Interestingly, we also showed (Figure 6) by 2D
NMR spectroscopy [81-82] that the HIV ND knobs are structurally isomorphous with the
immunodominant knobs in the tandem repeat protein, human mucin Muc-1 (a tumor antigen for breast,
pancreatic, and ovarian cancer). Each 20 amino acid repeat of Muc-1 consists of
(TSAPDTRPAPGSTAPPAHGV). The antigenic knob of Muc-1 is located at APDTR. Therefore, we
replaced the mucin antigenic knobs by the HIV ND knobs in a set of chimeric Muc-1/V3 antigens. The
repeat sequences of the chimeras are: (TSGPGRAFAPGSTAPPAHGV)n,
(IHIGPGRAFAPGSTAPPAHG)n, and (HIGPGRAPAPGSTAPPAHGV)n. The V3 inserts are
underlined. This produced multivalent HIV antigens in which NDs are located at regular intervals and
separated by extended mucin spacers. We have shown by 2D NMR spectroscopy that the multivalent
antigens preserve the NDs in their native structure. We have also demonstrated by ELISA
that the antigens correctly present the NDs to produce binding with monoclonal antibodies (mAbs) and
polyclonal antisera from AIDS infected patients. The antibody binding of these chimeras is equivalent to
that of the cyclic form of the MN V3 loop.

Muc-1/V3 antigens are unique in the following ways. (i) NMR and antibody binding data
[81] verify that they reproduce the native structure of the NDs even when they are presented in the context
of a totally unrelated protein like mucin Muc-1. (ii) Immunogens containing identical NDs within the
Muc-1 chimeras effectively allow enhanced presentation of a conserved structural feature of the virus in
a fashion not possible with non-chimeric HIV antigens. The true advantage of this approach will be to
induce either T-dependent or T-independent antibody responses to the ND depending on the precise
construction of the antigen. (iii) Multiple NDs, present in these chimeric proteins, may be advantageous
in enhancing the immune response by significantly increasing the affinity of antibody binding. The
importance of multiple NDs being present in the same antigen becomes clear by analyzing the relative
binding of Muc-1/V3 (HIGPGRAPAPGSTAPPAHGV)n peptides (n = 3 and 6) to different antisera. The data show
that the 120 residue peptide is a better ligand than the 60 residue peptide for the majority of the antisera
we tested. This is probably due to the fact that the higher number of ND knobs in the 120 residue
peptides are correctly disposed along the long axis of the molecule to facilitate the binding of bivalent
antibodies. (iv) Alternatively, the nature of the Muc-1/V3 structure (Figure 6) suggests that if two or
more different NDs are grafted alternately along the chain, there is enough flexibility in the spacers such
that two or more antibodies specific for two different NDs can both bind bivalently, interdigitating along
the molecule. Finally, there is no reason why more than two NDs cannot be introduced in the molecule.
This may be critical in designing vaccines for a highly mutating pathogen like HIV.


Figure 6. (A) Superimposition of the protruding motifs of two NMR structures: the V3 loop from the
HIV-MN isolate (designated as MN) and that from the Thailand TN243 isolate (designated as TN). The
sequences of the two motifs are: MN, RIHIGPGRAFYT and TN, SITIGPGQVFYR. Note that the GPGR
and GPGQ crests are oriented in the same way. (B) The principle of design. The V3 sequences shown above
the Muc-1 sequences replace the corresponding Muc-1 residues in the chimeras.

C.1c Sequence Variability at the Two Ends of the ND: Camouflaging of the Conserved
Secondary Structural Element

NMR studies on the V3 loop sequences listed in Figure 4 are summarized as follows
(Figure 7A-C). (i) A GPG type II turn is present at the crest of the V3 loop in all the sequences. (ii)
Stretches of β-strand adjacent to the GPG-turn on the N- and C-terminal sides are common to all the
sequences. (iii) The residues in the C-terminal segment form a few turns in water and a helix in the less
polar mixed solvent. (iv) In spite of the constraints of the secondary structures [(i)-(iii)] and the
disulfide bridge, the V3 loop exhibits conformational flexibility, as evidenced by the absence of the
long-range NOESY interactions commonly observed in well-folded globular proteins. However, a
"protruding knob" formed by the central GPG-turn and the β-strands on either side emerges as the
secondary structural feature conserved among diverse V3 loop sequences. The single crystal structure of
the HIV-1 neutralizing antibody (mAb 50.1) complexed with a 16-residue linear MN V3 fragment shows a
hint of such a "protruding knob", although the segment on the C-terminal side of the GPGR turn remains
disordered [88]. The crystallographic observation suggests that the protruding knob of the V3 loop,
which includes the neutralizing epitope, might well be specifically recognized by the antibody. However,
we cannot assume that the conserved "protruding knob" of the V3 loop will always be presented in its
conformationally pure form, because HIV can find ways to mask this conserved secondary structural
element. In this work we report one such mechanism of masking, as revealed by the "closed"
state in Figure 7D. In this form of the Haitian V3 loop, the NMR data indicate an arching of the residues
on the C-terminal side of the GPGK-turn. This is a departure from the "protruding knob" motif that
contains the central GPG-turn and two β-strands on either side. Such an arched conformation of the
neutralizing epitope has also been observed in an antibody (mAb 59.1) complexed with a linear V3
fragment [36]. When combined with the single crystal data on the mAb-V3 complex, our NMR data indicate
that the "closed" or "arched" conformation of the neutralizing epitope of the V3 loop is possible and can
be recognized by the antibody. In addition, our data indicate that an equilibrium between the
"closed" and "open" states (Figure 7D) is possible. The arching around A20-F21 tends to mask A20 and
F21, as shown by the solvent exposure data for the open and closed forms of the Haitian V3 loop. The
closed form of the V3 loop may camouflage some essential elements of the neutralizing epitope from the
immune system. For instance, this masking will interfere with the binding of antibodies that recognize
the PGRAF epitope. Most importantly, such local masking of A20 and F21 should affect the proteolysis
of the R/Q/K19-A20 peptide bond by thrombin and tryptase [66-67]; the second enzyme lies on the
T-cell surface. When gp120 is used as a substrate, these two enzymes, unlike other proteases, show
exceptional specificity for cleavage of the R/Q/K19-A20 peptide bond inside the V3 loop. Most
striking is the observation that the V3 loops of T-cell tropic virus strains are 1,000 times more
susceptible to cleavage by these two enzymes than the V3 loops of macrophage tropic strains [67]. The
T-cell tropic V3 loops are more positively charged than the macrophage tropic V3 loops [53]. Our studies
reveal that the open state of the neutralizing epitope of the V3 loop is exclusively preferred for the MN
and RF V3 loops with net charges > +5, whereas the closed state of the neutralizing epitope begins to
appear for the Haitian V3 loop with a net charge of +3. Therefore, we believe that the proteolysis data
[66-67] are consistent with our structural conclusions.
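
The net-charge comparison above can be illustrated with a simple residue count. The sketch below is only an approximation under stated assumptions: it counts K and R as +1 and D and E as -1, ignores histidine and the chain termini, and is applied to the two 12-residue crest motifs quoted in the Figure 6 caption rather than to the full 35-residue loops for which the charges above are reported. The function name is illustrative.

```python
def net_charge(sequence: str) -> int:
    """Approximate net charge near neutral pH from a one-letter sequence.

    Assumption of this sketch: K and R count +1, D and E count -1,
    and everything else (including H and the termini) counts 0.
    """
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return positive - negative


# Crest motifs quoted in the Figure 6 caption (not the full V3 loops).
motifs = {"MN": "RIHIGPGRAFYT", "TN": "SITIGPGQVFYR"}
for name, seq in motifs.items():
    print(name, seq, net_charge(seq))
```

Applied to full-length V3 sequences, the same count would give the kind of net charge used above to separate the open-state and closed-state loops.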

We have carried out 2D NMR and molecular modeling studies on three mini V3 loops,
each 17 amino acids long, derived from the MN, Florida, and Thailand V3 sequences (Figure 5).
These mini V3 loops contain the central GPG and the flanking sequences that are required for antibody
binding. We show that the presence of the (S-S) bridge between the 1st and the 17th residues (both C)
leads to a β-hairpin conformation for all three mini V3 loops despite their quite different sequences.
Therefore, with this design the conformational camouflaging can be avoided.

C.2 The V3 Loop in the Context of the Native gp120

C.2a An Insight into the Neutralization Escape Mutants of HIV-1: Molecular Modeling
and Neutralization Assays on gp120 Mutants with Single Site Mutations inside the V3
Loop

A dramatic effect of a single amino acid substitution in the V3 loop was encountered while
monitoring a patient over a period of time. In 1985, the patient carried virus with the GPGRA sequence
at the crest of the V3 loop. A mAb (M77) that binds to this V3 loop could neutralize the virus. In 1987,
the virus isolated from the same patient showed only one change in the V3 loop, i.e., an A21T
substitution. The single A21T substitution had a dramatic effect: (i) M77 could no longer bind the singly
mutated V3 loop and, therefore, (ii) M77 could not neutralize the new virus of 1987, thus giving rise to
an escape mutant through a single A21T substitution.

Monte Carlo (MC) simulated annealing studies [85-86] suggested that residue 21 is
involved in M77 recognition via "indirect reading" (Figure 8-1) as opposed to "direct reading"
(Figure 8-2), in which residue 21 would make direct contacts in the antibody binding cavity. In the
"indirect reading" mechanism, the contact domain of the V3 loop (amino acids 14-27 in Figure 8) in the
native V3 loop forms a protruding moiety which shows surface complementarity with the antibody binding
pocket of M77; A21 is not on the surface of the binding domain but in a specific cavity which can
accommodate A21 but not T21 (Figure 8-1). Therefore, the A21T substitution would result in the expansion
of the cavity and of the surface area of the contact domain, thereby leading to the loss of M77
specificity.


Figure 8. "Indirect Reading" (1) and "Direct Reading" (2) mechanisms of antibody (mAb M77)
recognition. Four different residues, i.e., A, T, S, and I, occupy position 21 of the V3 loop.


If the hypothesis of an "indirect reading" mechanism of M77 recognition is true, the
following predictions can be made: (i) an A21S substitution should not alter the M77 specificity because
A and S have about the same size, and therefore both should be accommodated in the same cavity, and (ii)
an A21I substitution should result in the loss of M77 specificity because I21 will be too large to fit in
the same cavity that snugly fits A21. Both predictions were tested experimentally and were found to be
true [85-86]. The work was done in collaboration with Drs. Fulvia Veronese and Marjorie Robert-Guroff of
the NCI/NIH. Figure 8 shows that A21 is involved in interactions (marked by dashed lines in 8-1) with
residues that are distant from the GPG-crest. Subsequent to our work, the crystal structure of a linear
V3 epitope complexed with a broadly neutralizing antibody (mAb 59.1) has been reported [89]. In this
complex, residue 21 also stays far from the GPG-crest.
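
The size argument behind the two predictions can be made concrete with a small comparison of residue volumes. The sketch below is illustrative only: the volumes are approximate, commonly tabulated values (not data from this report), and the 5 cubic angstrom tolerance and the function name are assumptions chosen for the example, not parameters of the M77 cavity.

```python
# Approximate residue volumes in cubic angstroms (commonly tabulated values;
# treated here as illustrative assumptions, not data from this report).
RESIDUE_VOLUME = {"A": 88.6, "S": 89.0, "T": 116.1, "I": 166.7}


def fits_cavity(residue: str, reference: str = "A", tolerance: float = 5.0) -> bool:
    """Return True if the residue volume is within `tolerance` of the reference residue.

    `tolerance` is a hypothetical threshold chosen only for this illustration.
    """
    return abs(RESIDUE_VOLUME[residue] - RESIDUE_VOLUME[reference]) <= tolerance


for aa in "ASTI":
    verdict = "fits" if fits_cavity(aa) else "too large"
    print(aa, RESIDUE_VOLUME[aa], verdict)
```

Under these assumptions S is grouped with A, while T and I are not, which is the pattern expected for the "indirect reading" mechanism.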

C.2b Effect of Single Site Mutations at the GPGR-Crest of the V3 Loop of gp120

It has been shown that any mutation in the GPG sequence of the V3 loop that destabilizes
the type II turn also affects the fusion activity of the virus. This suggests that the type II turn in
the V3 loop is critical in the life cycle of the virus. The four residues in GPGR/K/Q are numbered G1,
P2, G3, and R/K/Q4, respectively. Positions 1 and 3 in the (φ,ψ)-plot show a stereochemical preference
for G. This is especially true for position 1, which should strictly prefer G and can probably
accommodate A with a distortion in the turn [90]. Interestingly, an HIV-1 mutant with a G1->T1 mutation
in the type II turn is non-infectious for Sup-T1 T cells [61]. However, after 40 days of
coculture a T1->A1 revertant is identified. This revertant is infectious to Sup-T1 T cells. Note
that, under the standard genetic code, a T->G reversion requires two base changes in the codon (ACN to
GGN), while the T->A change requires a single change in the first position of the codon (ACN to GCN).

Experiments discussed above and our NMR data demonstrate the importance of a type II turn at
the GPG-crest of the V3 loop for viral pathogenesis. However, it remains unclear whether residues like
R/K/Q are also critically needed following the GPG sequence. We (in collaboration with the NCI/NIH)
have set up the following experiment (unpublished data) to examine the importance of a basic
(R/K) or a neutral (Q) residue after the conserved GPG sequence. We first performed molecular
modeling of various V3 loop sequences with a GPGE-crest (instead of GPGR/K/Q); the results
showed that the global structure of the V3 loop, as well as the type II turn at GPGE, remains the same.
We then constructed HIV-1 mutants with the GPGR->GPGE mutation inside the V3 loop of gp120. The mutant
virus replicated as well as the wild type. The mutant virus also appears to express the same number of
gp120 molecules on the surface as the wild type. However, the mutant virus is of the NSI type while the
wild type is of the SI type. After a passage of three weeks in co-culture, a revertant population (70%)
of the virus appeared. This revertant virus appears to have changed from the GPGE to the GPGK
sequence, and the virus becomes SI active. Note that the E->K reversion requires only a single base
change in the first position of the triplet codon. This leads us to conclude that R, K, or Q after the
GPG sequence is important in the life cycle of the virus.
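
The codon argument for the E->K reversion can be checked directly against the standard genetic code. The following is a minimal sketch; the table is the standard code written in TCAG order, and the helper names are illustrative rather than part of the original analysis.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons_for(amino_acid: str) -> list:
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def min_base_changes(aa_from: str, aa_to: str):
    """Return (minimum number of base changes, 1-based codon positions that differ)
    for the closest pair of codons encoding the two amino acids."""
    best = None
    for c1 in codons_for(aa_from):
        for c2 in codons_for(aa_to):
            diff = [i + 1 for i in range(3) if c1[i] != c2[i]]
            if best is None or len(diff) < len(best):
                best = diff
    return len(best), best


print(min_base_changes("E", "K"))  # (1, [1]): GAA/GAG -> AAA/AAG, first position
```

The same helper can be applied to any other pair of residues discussed in this section.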

C.3 Molecular Modeling of the V1-V2 and V4-C4 Subdomains

We have completed small-scale chemical synthesis and purification of the cyclic V1-V2
domain of the MN isolate. This domain is 79 amino acids long (Table 1) and contains two internal
disulfide bridges with the structure shown in Figure 2. The presence of the two disulfide bridges is
confirmed by mass spectrometry. Figure 9 shows the CD spectra of the MN V1-V2 domain at
different TFE solvent fractions. TFE is used to unmask the helical propensity of the residues inside the
V2 loop (Figure 2). A helical content of 15% is observed even in the polar aqueous environment. With
an increasing fraction of TFE in the solvent mixture, the helical content levels off at 23% at 20% TFE.
This probably implies that the core helical segment is still present in water. In the TFE:water mixture
the end-fraying is arrested and a longer helix is stabilized. Interestingly, our secondary structure
prediction [77-78] identified a helical stretch of 16 residues, as shown in Figure 2. This agrees with
the CD data at 20% TFE. Our MC simulated annealing method has produced an ensemble of energy-minimized
structures in which a stable helix is located inside the V2 loop. In this model, the percentages of
β-strand and turns are also close to what has been observed for the V1-V2 domain in 20% TFE. Because
the helix region (Figure 2; Table 1) shows a TFE-induced transition, we have attempted to visualize
how a helix->β-strand transition can be accommodated in V1-V2 under the constraint of the disulfide
bridges and with minimum changes in the rest of the molecule. Similarly, CD data [92] and the secondary
structure prediction algorithm seem to indicate the presence of a putative helix inside C4 of the V4-C4
subdomain (see Figure 3).
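
The agreement between the predicted 16-residue helix and the CD estimate can be checked with simple arithmetic. The short sketch below uses only the two numbers quoted above (16 helical residues in a 79-residue domain); the function name is illustrative.

```python
def helix_percent(helix_residues: int, total_residues: int) -> float:
    """Percentage of residues in a helix."""
    return 100.0 * helix_residues / total_residues


predicted = helix_percent(16, 79)
print(f"{predicted:.1f}% helix for a 16-residue helix in a 79-residue domain")
```

The result, about 20%, is of the same order as the helical content measured by CD at 20% TFE.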


Figure 9. (A) CD spectra of the doubly (S-S)-bridged V1-V2 sub-domain at three different
water/TFE ratios. Helical content is increased upon increasing the TFE % in the mixture. Note that a
12% helix is present in the structure even in aqueous environment. pH 6-8 did not alter the CD pattern.
Induction of 20% helix is consistent with a helical stretch spanning 167-182 which can undergo a helix
to β-strand transition inside the V1-V2 sub-domain.

We have utilized the secondary structure prediction algorithm and/or CD data to obtain
low-energy folded models of the doubly disulfide-bridged V1-V2 and V4-C4 loops. The methodology
consisted of the following steps.

Step 1 (Prediction of Secondary Structures). The secondary structural elements are predicted for
a cyclic loop sequence (with two or more disulfide bridges) by computing the probability S of a given
residue i in the loop to adopt a k-type of conformation (k = 'h' for helix; k = 'b' for beta sheet;
k = 'c' for coil; k = 't' for turn), where

$$S(k,i) = \sum_{l=-y}^{y} \frac{P(k,\,i+l)}{|l|+1}$$

(The summation is over l = -y to y, where y is the size of the window chosen to account for the effect
of the neighboring amino acid residues: y = 5 for h; y = 3 for b; and y = 4 for c or t.) P(k,i) is the
potential for the k type of conformation of the individual residue i, derived from the analysis of the
single crystal structures of about 65 proteins. The highest S(k,i) determines the conformation k for
residue i. Existing algorithms for secondary structure prediction are only about 60% accurate. In order
to improve accuracy, we test our predictions by requiring an S-S bridge formation that achieves local
energy minima for the loop; this leads to the next step in our method.
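
The windowed score of Step 1 can be sketched in a few lines. This is a minimal illustration, assuming the weighting P(k, i+l)/(|l|+1) and the window sizes given above; the per-residue potentials in the example are hypothetical placeholders, and skipping window positions that fall off the ends of the chain is an assumption of the sketch (a cyclic treatment would be another option).

```python
# Half-window y for each conformation type, as given in the text.
WINDOW = {"h": 5, "b": 3, "c": 4, "t": 4}


def score(potentials, k, i):
    """S(k, i): sum over l = -y..y of P(k, i+l) / (|l| + 1).

    `potentials` is a list with one dict per residue, mapping k -> P(k, residue).
    Window positions outside the chain are skipped (assumption of this sketch).
    """
    y = WINDOW[k]
    total = 0.0
    for l in range(-y, y + 1):
        j = i + l
        if 0 <= j < len(potentials):
            total += potentials[j][k] / (abs(l) + 1)
    return total


def predict(potentials):
    """Assign to each residue the conformation with the highest S(k, i)."""
    return "".join(
        max(WINDOW, key=lambda k: score(potentials, k, i))
        for i in range(len(potentials))
    )


# Hypothetical potentials for a three-residue toy chain.
toy = [{"h": 1.0, "b": 0.5, "c": 0.2, "t": 0.1} for _ in range(3)]
print(predict(toy))
```

In practice the potentials would come from the statistical analysis of known protein structures described above, one table per conformation type.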

Step 2 (Generation of the Energy-Minimized S-S-Bridged Loop). This step involves
obtaining an energetically stable S-S-bridged structure for a V3 loop sequence, given the secondary
structural states of the constituent amino acid residues as obtained after step 1. Appropriate ranges of
(φ, ψ) values are assigned to all amino acids. For example,

φ = -55° ± 25°, ψ = -55° ± 25° for residues in a helix,

φ = -140° ± 30°, ψ = 140° ± 30° for residues in a beta strand,

φ(i+1) = -65° ± 20°, ψ(i+1) = -50° ± 20°, φ(i+2) = -90° ± 20°, ψ(i+2) = 0° ± 20°
for residues in a type-I turn,

φ(i+1) = -65° ± 20°, ψ(i+1) = 120° ± 20°, φ(i+2) = 90° ± 20°, ψ(i+2) = 0° ± 20°
for residues in a type-II turn.

Residues in the coil state are set free to choose any point in the allowed space. We simplify the
sequence by assuming A for residues with side chains extending beyond the Cβ atom, except for the Ps and
the terminal Cs. Our rationale for doing this is that the allowed (φ, ψ) space of residues with a side
chain longer than A is only a subspace of that allowed for A.
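
The assignment of torsion ranges in Step 2 can be sketched as a lookup table plus a sampler for starting configurations. This is a minimal sketch assuming helix, strand, and turn ranges of -55 ± 25, -140/140 ± 30, and the type-I/type-II turn values with ± 20 (all in degrees); the state labels and function name are illustrative, and drawing coil residues from the full -180 to 180 degree range is a simplification rather than the "allowed space" used in the original work.

```python
import random

# (center, half-width) in degrees for (phi, psi) in each secondary structural state.
TORSION_RANGES = {
    "helix": ((-55, 25), (-55, 25)),
    "strand": ((-140, 30), (140, 30)),
    "turnI_i+1": ((-65, 20), (-50, 20)),
    "turnI_i+2": ((-90, 20), (0, 20)),
    "turnII_i+1": ((-65, 20), (120, 20)),
    "turnII_i+2": ((90, 20), (0, 20)),
}


def sample_torsions(state, rng):
    """Draw one (phi, psi) pair uniformly within the range assigned to `state`.

    Coil residues are drawn from the full -180..180 range (simplification).
    """
    if state == "coil":
        return rng.uniform(-180, 180), rng.uniform(-180, 180)
    (phi0, dphi), (psi0, dpsi) = TORSION_RANGES[state]
    return (
        rng.uniform(phi0 - dphi, phi0 + dphi),
        rng.uniform(psi0 - dpsi, psi0 + dpsi),
    )


# Hypothetical state assignment for a short hairpin-like stretch.
rng = random.Random(0)
states = ["strand", "strand", "turnII_i+1", "turnII_i+2", "strand", "strand"]
start = [sample_torsions(s, rng) for s in states]
for s, (phi, psi) in zip(states, start):
    print(f"{s:12s} phi={phi:7.1f} psi={psi:7.1f}")
```

Repeating the sampling with different seeds gives several starting configurations within the allowed domains.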

We have obtained an S-S-bridged structure of a loop by using a linked-atom least-squares
refinement that minimizes the function F in the (φ, ψ) space:

$$F = \sum_{i} \lambda_i G_i + \sum \left(d_{ij}^{mn} - D^{mn}\right)^2$$

where G_i (= |r_i - r_i^0| = 0) indicates the distance constraints for an S-S bridge. The distances in
the S-S-bridged V3 loop configuration are defined as r_1 = S(C1) - S(C35), r_2 = Cβ(C1) - S(C35),
r_3 = Cβ(C35) - S(C1), and r_4 = Cβ(C1) - Cβ(C35); the corresponding equilibrium distances are
r_1^0 = 2.04 Å, r_2^0 = r_3^0 = 3.05 Å, and r_4^0 = 3.85 Å. λ_i indicates the Lagrangian multipliers;
d_ij^mn indicates the distance between atom i (type m) and atom j (type n); and D^mn indicates the
contact limit between an atom of type m and an atom of type n. In this refinement the (φ, ψ) of the
various residues are treated as elastic variables (i.e., variables with weights) such that, by an
appropriate choice of weights, the predicted secondary structural states of the residues (after step 1)
are minimally altered. This method guarantees a stereochemically orthodox structure for the S-S-bridged
sequence. Finally, appropriate side chains are attached (in place of the As) to generate the actual
sequence, and the potential energy of the system is minimized in the (φ, ψ, ω, χ)-space using the force
field of Scheraga and co-workers. The total conformational energy, ETOT (kcal/mol), has the following
components:

ETOT = EES (Coulomb interactions between pairs of partial charges, dielectric constant = 80)

+ ETOR (torsional energy due to barriers around single and partially double C-N bonds)

+ ENB (van der Waals attraction and repulsion terms between non-bonded atom pairs)

+ ESS (constraint energy due to S-S bonds)

+ EDIS (energy due to distance constraints as present in different secondary structures)

Several starting configurations are chosen within the allowed domains of the (φ,ψ)-space.
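
The S-S bridge constraint used in the least-squares refinement of Step 2 can be sketched as a penalty on Cartesian coordinates. This is only an illustration under stated assumptions: it uses the four equilibrium distances quoted above (2.04, 3.05, 3.05, and 3.85 Å), replaces the Lagrangian multipliers by one fixed weight, penalises only those atom pairs that are closer than their contact limit, and works on coordinates rather than in torsion space. The atom labels and the function name are hypothetical.

```python
import math

# Equilibrium S-S bridge distances in angstroms for the four constrained atom pairs.
BRIDGE = {
    ("S1", "S35"): 2.04,
    ("CB1", "S35"): 3.05,
    ("CB35", "S1"): 3.05,
    ("CB1", "CB35"): 3.85,
}


def bridge_and_contact_penalty(atoms, contacts, weight=1.0):
    """Constraint term plus squared contact violations.

    atoms: dict of atom label -> (x, y, z) for the four bridge atoms.
    contacts: iterable of (xyz_a, xyz_b, contact_limit); only pairs closer than
        the limit are penalised (assumption of this sketch).
    weight: one fixed multiplier standing in for every Lagrangian multiplier.
    """
    f = 0.0
    for (a, b), r0 in BRIDGE.items():
        f += weight * abs(math.dist(atoms[a], atoms[b]) - r0)
    for xyz_a, xyz_b, limit in contacts:
        d = math.dist(xyz_a, xyz_b)
        if d < limit:
            f += (d - limit) ** 2
    return f


# Hypothetical coordinates, for illustration only.
atoms = {
    "S1": (0.0, 0.0, 0.0),
    "S35": (2.04, 0.0, 0.0),
    "CB1": (-0.6, 1.7, 0.0),
    "CB35": (2.6, 1.7, 0.5),
}
clash = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 3.0)]
print(round(bridge_and_contact_penalty(atoms, clash), 3))
```

A refinement would vary the torsion angles, rebuild the coordinates, and drive this penalty toward zero.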

Step 3 (Exploration of the Conformational Flexibility by Monte Carlo Simulated
Annealing). The simulated annealing is performed in the following manner. First, a starting
energy-minimized structure is chosen, Monte Carlo (MC) simulations are performed for 50,000 steps at
600 K in the (φ, ψ, ω, χ)-space, and the lowest energy configuration is stored. Second, 50,000 MC steps
are repeated in several cycles of gradually decreasing temperature until a temperature of 100 K is
reached. Third, the lowest energy configuration at 100 K is further energy minimized to a low energy
gradient.

Steps 1-3 are repeated for several different starting configurations. This results in several
low-energy structures of V1-V2 and V4-C4. If these structures belong to the same energy basin, an
average model can be obtained that represents a given global fold.
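
The annealing schedule of Step 3 can be sketched with a Metropolis Monte Carlo loop over torsion angles. This is a toy illustration, not the force field or move set used in this work: the energy function is a hypothetical cosine term, the intermediate temperatures between 600 K and 100 K are assumed, each cycle continues from the current configuration, and the final gradient minimization is omitted.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def toy_energy(angles):
    """Hypothetical torsional energy with a minimum at -60 degrees for every angle."""
    return sum(1.0 - math.cos(math.radians(a + 60.0)) for a in angles)


def anneal(energy, start, temperatures=(600, 500, 400, 300, 200, 100),
           steps=50_000, max_move=10.0, seed=0):
    """Metropolis Monte Carlo at each temperature in turn.

    Returns the lowest-energy configuration seen and its energy.
    """
    rng = random.Random(seed)
    current, e_current = list(start), energy(start)
    best, e_best = list(current), e_current
    for temperature in temperatures:
        for _ in range(steps):
            trial = list(current)
            k = rng.randrange(len(trial))
            trial[k] += rng.uniform(-max_move, max_move)  # perturb one torsion angle
            e_trial = energy(trial)
            delta = e_trial - e_current
            if delta <= 0 or rng.random() < math.exp(-delta / (K_B * temperature)):
                current, e_current = trial, e_trial
                if e_current < e_best:
                    best, e_best = list(current), e_current
    return best, e_best


# Short run for illustration; the text uses 50,000 steps per temperature.
best, e_best = anneal(toy_energy, [120.0, 120.0, 120.0], steps=2_000)
print(f"lowest toy energy found: {e_best:.4f}")
```

Running the same routine from several starting configurations, and checking whether the resulting structures fall in the same energy basin, mirrors the repetition of Steps 1-3 described above.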

C.4 A Model of gp120

We have used molecular modeling to construct a model of MN-gp120. Details are given
in reference 86. Briefly, our methodology is as follows. The mature form of the gp120 protein of the
MN variant of HIV-1 contains 513 residues. At present, we have modeled a fragment of gp120 spanning
the V1-V2-C2-V3-C3-V4-C4 region (see Figure 1); the C1 and V5/C5 fragments are ignored. This
fragment contains 6 disulfide bridges. The first six C's define the V1 and V2 loops (V1-V2), the 9th to
12th C's lock a central domain of the C2 region (C2b), the 13th and 14th C's define the V3 loop, and the
last four C's form the V4 and the C4 loops (V4-C4). We have used the following steps to model the
fragment:

(1)  We  have  chosen  the  average  energy-minimized  models  of  the  following  sub-domains  except  for  V3 
for  which  we  have  used  the  NMR  model: 

(a)  the  V1-V2  region  (residues:  118-210) 

(b)  the  C2b  region  (residues:  223-252) 

(c)  V3  loop  (residues:  301-335) 

(d)  the  V4-C4  region  (residues:  381-445) 

(2) We have joined these sub-domains with the appropriate linkers (i.e., C2, C3, etc., as shown in
Figure 1).

(3) We have performed MD simulated annealing to generate (e) the V1-V2-C2-V3 (residues: 118-335)
and (f) the V3-C3-V4-C4 (residues: 301-445) fragments. V1-V2, C2b, V4-C4, and V3 have been varied
around their average structures in accordance with their observed conformational flexibility as revealed
by NMR or modeling. We have started with the linkers in fully extended conformations. These regions
include the N- and C-terminal peptides of the C2 region (residues: 211-222, C2a region, and residues:
253-300, C2c region). The two fragments have then been subjected to 200 ps of molecular dynamics (MD) at
500 K using the united-atom Amber 4.0 force field. We have used a bulky terminal group of van der Waals
radius 5 Å to model minimal N-glycosylation. 100 structures (i.e., one after every 2 ps) have been
selected from a 200-ps MD trajectory after equilibration. Representative structures for each fragment
have been annealed and then minimized using 2000 steps of conjugate gradients.

(4) Once folded, the two fragments have been fused together by superimposing the helical region of the
V3 loop present in both fragments. The structure has then been minimized to relieve initial strains from
the docking of the two structures and equilibrated at 300 K for 50 ps of molecular dynamics. The
configuration at the end of the 50-ps trajectory has been annealed and minimized for 2000 conjugate
gradient steps. This has resulted in an energy-minimized model of the (V1-V2-C2-V3-C3-V4-C4)
fragment of gp120. We have repeated this step with several different starting models of the
(V1-V2-C2-V3) and (V3-C3-V4-C4) fragments.

(5) An ensemble of energy-minimized structures of the (C1-V1-V2-C2-V3-C3-V4-C4) has been tested
against the surface accessibility data from the immunochemical maps [21-39, 69] and long-range
interaction data from various functional assays [40-50].

(6) The screened gp120 models have been analyzed to study (i) the overall tertiary folding, (ii) the
structures of various linear epitopes, (iii) specific interactions stabilizing intradomain interactions,
(iv) masking of the constant regions by the variable regions, and (v) the role of inter-domain
interactions in creating discontinuous epitopes. Finally, a few of the critical inter-domain interactions
have been identified for the experimental testing of the model.

Figure 10. A ribbon diagram of the (V1-V2-C2-V3-C3-V4-C4) fragment of MN gp120. This is a
representative model of gp120 that is consistent with the data from immunochemical maps and other
functional assays. Color coding: green for V1, blue for V2, red for V3, magenta for C4, gray for C2 & C3,
and yellow for the (S-S) bridges.

Figure 11A. Inter-domain interactions of the residues in the (V1-V2) sub-domain. Color coding: purple
for V1, yellow for the (S-S) bridges, cyan for the linear epitope in V2, purple for the conformational
epitope in V2, gray for the rest of V2, green for the interacting polar residues, and red for the
interacting non-polar residues. Note that the linear V2 epitope adopts a helix whereas the conformational
V2 epitope adopts a β-hairpin in the folded gp120. Also note that the β-hairpin is involved in several
inter-domain long-range interactions with the residues in C2 and V4. The turn at residues 135-140 of V1
makes contact with W427 in C4; W427 is critical for the binding of the CD4-blocking antibodies.

Figure 11B. Long-range inter-domain interactions involving residues from the V1, V3, and C4 loops.
Color coding: purple for V4, blue for C4, yellow for the (S-S) bridges, green for interacting charged
residues from the V1, V3, and C3 regions, red for the interacting non-polar residues from the V1, V3,
and C3 regions. A turn at V1, the N-terminal V3 segment, and a helical stretch in C4 form an intricate
contact interface. P437 is critical in inducing a sharp turn in C4.

We have used the Molecular Surface Package, version 2.6.2, for computing the solvent
accessible surfaces. For the solvent accessibility calculations a probe sphere of radius 1.5 Å has been
used. To compute the residue-specific accessible surfaces, the accessible surface for each residue (X) is
normalized by the surface of the same residue (X) in a fully extended conformation of G-X-G.
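
The normalization of residue accessibility by the extended G-X-G reference can be written as a one-line ratio. The sketch below is illustrative: the function name is an assumption, and the surface areas in the example are hypothetical numbers chosen only to show the calculation, not values from the gp120 model.

```python
def relative_accessibility(residue_area: float, reference_area: float) -> float:
    """Accessible surface of residue X divided by its surface in extended G-X-G."""
    if reference_area <= 0:
        raise ValueError("reference area must be positive")
    return residue_area / reference_area


# Hypothetical areas in square angstroms, for illustration only.
observed = {"W427": 45.0, "R303": 120.0}
reference = {"W427": 250.0, "R303": 240.0}
for name in observed:
    value = relative_accessibility(observed[name], reference[name])
    print(f"{name}: {value:.2f}")
```

A value near 0 indicates a buried residue and a value near 1 a fully exposed one, which is the scale on which computed and observed accessibilities can be compared.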

Figure 10 shows a representative low-energy structure of the V1-V2-C2-V3-C3-V4-C4
fragment of gp120. It is clear from Figure 10 that the tip of the V3 loop, the helix inside V2, and a
part of the C4 loop are all solvent exposed. Theoretical solvent accessibility data are computed for all
the residues in our model of the V1-V2-C2-V3-C3-V4-C4 fragment; note that (see Appendix I) the computed
values agree well with the observed data [69]. Table 3 lists a set of H-bonding interactions stabilizing
intra- and inter-domain structures. The three types of H-bonds in Table 3 originate from three different
types of interactions, namely, backbone-backbone, backbone-sidechain, and sidechain-sidechain (the last
two interactions are sequence specific). For example, in our model (Figure 11) OH-Y435 (of C4) is
H-bonded to OD1-D22 (of V1) and NZ-K341 (of C3) is H-bonded to OE1/OE2-E440 (of C4). Table 3 also
lists key hydrophobic interactions involving residue pairs from different domains. Most of these
interactions involve residue pairs inside a tight cavity showing either van der Waals contacts [e.g.,
I186 (of V2) and I337 (of V3)] or stacking overlaps [e.g., Y306 (of V3) and W427 (of C4)]. Figure 11
shows the inter-domain interactions involving (A) (V3 and V1/V2) and (B) (V3 and C4), i.e.,
N*135-T136-T137-N138-N139-N140 (of V1), N*300-C301-T302-R303-P304-N305-Y306-N307 (of V3), and
Q422-I423-I424-N425-M426-W427-Q428-E429-V430-G431-K432-A433-M434-Y435-A436 (of C4); key residues
in these segments are marked in bold (N* = glycosylated N). As shown in Figure 11A, the
linear V2 epitope adopts a helix whereas the conformational V2 epitope adopts a β-hairpin in the folded
gp120. Also note that the β-hairpin is involved in several inter-domain long-range interactions with the
residues in C2 and V4. The turn at residues 135-140 of V1 makes contact with W427 in C4; W427 is
critical for the binding of the CD4-blocking antibodies. Experimental data also support the presence of
(V1 and C4) interactions in the native gp120. Table 3 shows that K183/L184/D185 of the V2 loop are all
involved in inter-domain H-bonds. The C=O of L184 is H-bonded to the basic sidechain of R278
whereas the acidic sidechain of D185 is H-bonded to the basic sidechain of K212; also the sidechains of
E274 and L184 are within 5 Å. Therefore, if such a β-hairpin is critical for antibody binding, a double
site mutation, L184/D185->D184/L185, would be catastrophic because it would bring D184 close to
E274 and L185 into steric clash with K212 (both of which would destabilize the β-hairpin). Indeed, an
L184/D185->D184/L185 double mutation reduces the binding affinity of antibodies specific for
conformational epitopes inside the V2 loop. Table 3 shows that R303 and N*300 form a
sidechain-sidechain H-bond which stabilizes a turn at residues 300-303. This turn (as shown in Figure 11)
brings the Y306 ring into close proximity with the W427 ring. The turn at residues 300-303 also locks
R303 in H-bonding interactions with (G441 and Q442) and N*300 in H-bonding interactions with (A436
and P437); see Table 3. The key residues, W427 and Y435, also show sidechain-backbone H-bonds
with N138 and N*140 (Table 3). Therefore, we predict that the residues N*300 and R303 of V3 and the
residue W427 of C4 are critical in bringing V1, V3, and C4 into spatial proximity. Interestingly,
antibody binding studies have demonstrated that the R303G substitution exposes the N-terminal V3
fragment; from our model we predict that such a substitution would abolish several key inter-domain
interactions involving the (V1 and V3) and (V3 and C4) loops. Similarly, antibody binding studies reveal
that the W427S substitution exposes the residues 420-435 in the C4 fragment; from our model we also
predict that such a substitution would disable several key inter-domain interactions involving the (V1
and V3) and (V3 and C4) loops.

Additional data on gp120 modeling in Appendix I include the following.

(i) (Figure 12) The MN-gp120 sequence is aligned with the IIIB-gp120 sequence (note that most of the
experimental data have been obtained for IIIB-gp120).

(ii) (Figure 13) A (φ,ψ) plot of the residues in V1-V2-C2-V3-C3-V4-C4 as shown in Figure 10 (this
plot shows that most of the residues fall within the allowed region of the plot).

(iii) (Figure 14) Solvent accessibility of various contiguous amino acid stretches computed for V1-
V2-C2-V3-C3-V4-C4 as shown in Figure 10 (the computed values agree well with those obtained from the
immunochemical maps of gp120).

(iv) (Figure 15) A close-up view of the (A) V1-V2 and (B) V4-C4 loops taken from Figure 10
(note that the average low-energy models of these two loops are also quite similar).

(v) (Figure 16) A close-up view of the β-hairpin (residues 183-188) that is an important part of the
conformational epitope inside V1-V2 (the effect of the L184/D185 to D184/L185 double mutation is described).

(vi) (Table 4) Listing of inter-domain H-bonds involving backbone-backbone, backbone-sidechain,
and sidechain-sidechain interactions.


Appendix  II  includes  our  structural  biology  work  on  the  human  mucin,  Muc-1,  a  breast 
cancer  tumor  antigen.  This  tandem  repeat  protein  is  critically  important  in  breast  cancer  research,  a  major 
research  initiative  in  the  US  Army. 
APPENDIX I


Figure 12. In-frame alignment of MN and IIIB gp120. Note that most of the immunochemical and other
functional assays are performed on single and double site mutants of IIIB gp120 [21-50].
Figure 13. (φ,ψ) values of the residues in the gp120 model of Figure 10. Note that most of the residues
lie within the allowed regions of the Ramachandran plot. A few non-glycine residues in the disallowed
region (i.e., the center of the right bottom quadrant) are marked. This plot supports the stereochemical
feasibility of our model and the validity of our conformational search.
Figure 14. Site-specific solvent accessible surface area (in Å²). The MAS (maximum accessible surface)
values are given for the most exposed non-hydrogen atom in a residue after scaling it by the surface area of
the same atom in the same residue (X) of the extended G-X-G. Therefore MAS values well above 1
imply well exposed residues. Color coding: green for V1, cyan for V2, black for C2 and C3, red for
V3, magenta for V4, and blue for C4. Observed accessibility data for different segments of gp120 are
shown as bars, i.e., the bars with MAS > 4 represent well exposed segments, the bars with MAS ~ 1
represent poorly exposed segments, and the bars with MAS ~ 0 represent buried segments. The location
and length of the bars define the epitope on gp120. Note that the calculated surface accessibility areas of
the epitopes inside the V1, V2, V3, V4, and C4 loops agree well with the observed data [69].
Table 4. Summary of intra- and inter-domain backbone-backbone (bb), backbone-sidechain (bs), and
sidechain-sidechain (ss) H-bonds. The cut-off for the donor-acceptor distance is 3 Å and the cut-off for the
donor-proton-acceptor angle is 120°. The presence of a large number of intra-domain H-bonds inside the
C2 and C3 sub-domains implies that they form well folded structures. Also note that there are a significant
number of inter-domain H-bonds between the C2 and C3 sub-domains (which constitute a part of the
discontinuous epitope for CD4 binding). The numbers of intra- and inter-domain H-bonds are averages
over sampled gp120 models that are consistent with the data from the immunochemical maps [69] and
other functional assays [40-50].


          V1   V2   C2   V3   C3   V4   C4

V1   bb   16    4    6    0    0    0    0
     bs   14    8    6    0    0    0    2
     ss   10    4    4    3    0    0    1

V2   bb    4   26    4    0    0    0    0
     bs    8    9    9    2    0    0    0
     ss    4   10    6    0    0    0    0

C2   bb    6    4   28    0    1    0    1
     bs    6    9   41    5    2    0    2
     ss    4    6   16    5    0    0    1

V3   bb    0    0    0   16    1    4    0
     bs    0    2    5   14    0   11    4
     ss    3    0    5    5    0    6    0

C3   bb    0    0    1    1   16    4    1
     bs    0    0    2    0   29    2    3
     ss    0    0    0    0    9    2    8

V4   bb    0    0    0    4    4    7    0
     bs    0    0    0   11    2   12    7
     ss    0    0    0    6    2    2    1

C4   bb    0    0    1    0    1    0   17
     bs    2    0    2    4    3    7    5
     ss    1    0    1    0    8    1    3
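
The geometric criterion stated in the caption of Table 4 (a donor-acceptor distance cut-off of 3 Å and a donor-proton-acceptor angle cut-off of 120°) can be sketched as follows. This is a minimal illustration and not the program used to compile the table; the function name `is_hbond`, the coordinate layout, and the reading of the cut-offs as "distance at most 3 Å and angle at least 120°" are assumptions.

```python
import math

def is_hbond(donor, proton, acceptor, max_da=3.0, min_angle=120.0):
    """Return True if a donor-H...acceptor triple passes both cut-offs.

    donor, proton, acceptor: (x, y, z) coordinates in Angstrom.
    max_da: donor-acceptor distance cut-off (assumed to be an upper limit).
    min_angle: donor-proton-acceptor angle cut-off in degrees
               (assumed to be a lower limit).
    """
    if math.dist(donor, acceptor) > max_da:
        return False
    # Angle at the proton between the proton->donor and proton->acceptor vectors.
    v1 = [d - p for d, p in zip(donor, proton)]
    v2 = [a - p for a, p in zip(acceptor, proton)]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return False
    cos_angle = sum(a * b for a, b in zip(v1, v2)) / (n1 * n2)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle)) >= min_angle
```

Counting the triples that pass this test, grouped by the sub-domain and by the backbone or sidechain origin of the donor and acceptor, would give one matrix of the kind shown in Table 4 for a single model; the table reports averages over the sampled models.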
Figure 15A. A close-up view of the V1-V2 loop taken from Figure 10.


Figure  15B.  A  close-up  view  of  the  V4-C4  loop  taken  from  Figure  10;  color  coding:  V4  in  green,  C4  in 
magenta,  and  the  two  disulfide  bridges  in  yellow. 
Figure 16. A part of the β-hairpin (residues 183-188). Note that K183/L184/D185 of the V2 loop are all
involved in inter-domain H-bonds (see also Table 3). The C=O of L184 is H-bonded to the basic
sidechain of R278 whereas the acidic sidechain of D185 is H-bonded to the basic sidechain of K212; also
the sidechains of E274 and L184 are within 5 Å. Therefore, if such a β-hairpin is critical for antibody
binding, a double site mutation, L184/D185 to D184/L185, would be catastrophic because this would bring
D184 close to E274 and L185 in steric clash with K212 (both of which would destabilize the β-hairpin).
Indeed, a double L184/D185 to D184/L185 mutation reduces the binding affinity of antibodies specific for
conformational epitopes inside the V2 loop [33-39].
APPENDIX II


Structural Studies on the Human Mucin, Muc-1: Effect of O-Glycosylation at
Threonine Residues

Human mucins are a family of high molecular weight, heavily O-glycosylated proteins
which are dominated by large tandem repeat domains. Mucin tandem repeat domains vary in size, proline
content, and potential extent of O-glycosylation. Underglycosylation of the human mucin Muc-1 tandem
repeat domain in certain breast, pancreatic and ovarian tumors results in the unmasking of protein core
epitopes. Tumor reactive, mucin-specific monoclonal antibodies reveal differences between the surface
of Muc-1 derived from tumors and normal tissues. Synthetic peptide studies show that most tumor
specific antibodies recognize an epitope within the tandem repeat protein core of Muc-1. Two-dimensional
nuclear magnetic resonance experiments are performed on chemically synthesized mucin
tandem repeat polypeptides, (P1D2T3R4P5A6P7G8S9T10A11P12P13A14H15G16V17T18S19A20)n
for n = 1, 3. These studies demonstrate how the tandem repeats assemble in space giving rise to the overall
tertiary structure, and the local structure and presentation of the (underlined) antigenic site (APDTR) at the
junction of two neighboring repeats. The NMR data (Figure 17) reveal repeating knob-like structures
connected by extended spacers. The knobs protrude away from the long axis of Muc-1 and the
predominant antigenic site (APDTR) forms the accessible tip of the knob. Multiple tandem repeats
enhance the rigidity and presentation of the knob-like structures. We have enzymatically O-glycosylated
the Muc-1 tandem repeat (PDTRPAPGSTAPPAHGVTSA)3. The glycosylation occurs preferentially at
the T-residues except at the T belonging to the immunodominant (APDTR) knob. Detailed NMR
analyses show that the pentapeptide (APDTR) maintains the same turn as in the unglycosylated form
while additions of GalNAc at T10 and T18 in the glycosylated mucin alter the local structure (Figure 18).
Also, the glycosylated structure is less flexible.
Figure 17. 2D NOESY cross-section of the O-linked GalNAc-conjugated T10 and T18 of Muc-1,
(P1D2T3R4P5A6P7G8S9T10A11P12P13A14H15G16V17T18S19A20)3. Mixing time = 400 ms,
temperature = 10 °C.


Figure 18. (top) A schematic representation of Muc-1 in which the immunodominant knobs containing
(APDTR) are connected by extended spacers. (bottom) Solution structure of
(P1D2T3R4P5A6P7G8S9T10A11P12P13A14H15G16V17T18S19A20)3 (shown in cyan) in the
unglycosylated form. Solution structure of
(P1D2T3R4P5A6P7G8S9T10A11P12P13A14H15G16V17T18S19A20)3 (shown in green and brown) in
which T10 and T18 in each repeat are GalNAc-conjugated. The fact that T3 of APDTR escapes
glycosylation by GalNAc transferase implies that T3 is part of a structured (and less exposed) element.
GalNAc moieties are modeled using the force field parameters from refs. 93-94.
Abstract

The principal neutralizing determinant (PND) of human immunodeficiency virus (HIV) is
located inside the third variable loop (designated the V3 loop) of the envelope glycoprotein
gp120. The V3 loop is typically 35 amino acids long, and the 1st and the 35th residues
in the loop are invariant cysteines involved in a disulfide bridge. Although PNDs from different
HIV isolates contain a conserved GPG-sequence, the amino acids flanking the conserved
sequence show hypervariability among HIV isolates; the GPG and the two flanking
regions are collectively referred to as the GPG-crest or the PND. The amino acid sequence
variability in the GPG-crest gives rise to different antigenic specificities for PNDs
from different HIV isolates. By combining two-dimensional nuclear magnetic resonance
(2D NMR) and molecular modeling techniques, we have developed a method to study (1) the
global tertiary fold of the V3 loops of HIV and (2) the local structure of the PND at the tip of
the V3 loop. In this article, we report the results of our structural studies on the V3 loop of a
Thailand HIV isolate. The sequential assignment is made by combining DQF-COSY, TOCSY,
and NOESY/ROESY experiments. Various intra- and inter-residue inter-proton distances
are estimated by full-matrix analyses of the NOESY data at mixing times of 100 and 400 ms
and of the ROESY data at mixing times of 60 and 200 ms. 100 inter-residue distances are used

Abbreviations: Human Immunodeficiency Virus (HIV), Principal Neutralizing Determinant (PND),
Correlation Spectroscopy (COSY), Double Quantum Filtered COSY (DQF-COSY), Total COSY (TOCSY),
Nuclear Overhauser and Exchange Spectroscopy (NOESY), Rotating Frame Nuclear Overhauser and
Exchange Spectroscopy (ROESY).
as structural constraints in a simulated annealing procedure to derive energetically stable structures.
Two functional motifs in the V3 loop, i.e., the glycosylation site and the GPG-crest, form
defined structures: a turn is located at the glycosylation site, and the GPG-crest forms a protruding
domain with a type-II GPGQ turn. The other regions of the V3 loop are rather flexible, especially
the C-terminal DIRKAYC-stretch. These flexible regions of the V3 loop lead to conformational
flexure of the entire V3 loop without altering the local structures of the glycosylation site or the
GPG-crest. However, the ROESY experiments revealed no slow exchange among different V3
loop conformations, and therefore the flexible conformations are in fast exchange within the NMR
time scale. The extent of this conformational flexibility is also discussed.

Introduction 


Studies on the feasibility of a subunit vaccine to protect against HIV-1 infection
have mainly focused on the outer envelope glycoprotein, gp120 [1]. The PND is
located inside the V3 loop of gp120 [2]. Antibodies elicited by the PND block virus
infectivity, thus neutralizing the virus [3-5]. Neutralizing antibodies can also block
viral infection by inhibiting fusion of HIV-infected cells with CD4-positive uninfected
cells [6]. The role of the PND in virus neutralization and inhibition of cell fusion
has made it the focus of intense research in the development of a vaccine directed
against HIV infection. However, progress in vaccine development has been impeded
by amino acid sequence variability among different HIV isolates, particularly in the
V3 loop [7]. Neutralizing antibodies elicited by the PND from one HIV isolate do
not neutralize other HIV isolates [8]. Sequence and structure analyses of the V3
loops from a large number of HIV isolates are therefore required to accurately
define global and local structural differences among different V3 loop sequences.
Because an antibody recognizes a specific three-dimensional structure of the antigen
[9], performing the above analyses is central to understanding the specific interactions
that govern PND-antibody complex formation. Such studies are also relevant
because the (S-S)-bridged V3 loop is the smallest part of gp120 that is likely to present
the PND to the antibody in a manner similar to the entire envelope protein. A
systematic study of several different V3 loop sequences will allow us to map the
amino acid sequence variability in terms of the global structure of the entire V3 loop
and the local structures of various functional motifs located inside the V3 loop, i.e.,
the glycosylation site and the GPG-crest. In this article we outline the results of our
studies on the Thailand TN243 viral V3 loop sequence (Figure 1) combining 2D
NMR and molecular modeling techniques. This V3 loop sequence, which is quite
different from the viral V3 loop sequence found in North America [8], may account
for the virulence and transmissibility of this viral strain [25,26].

Materials  and  Methods 

Chemical  Synthesis  and  purification 

The V3 loop sequence was reported by McCutchan et al. [25]. First, the 36-amino-acid-long
linear peptide was synthesized using Merrifield's solid phase synthesis protocol.
Two cysteine residues (i.e., C2 and C36) involved in the (S-S)-bridge were protected using a
methylbenzyl group (MeBzl), and the terminal C1 was protected by using an acetamidomethyl
group (ACM). The MeBzl groups on C2 and C36 were removed by using
anhydrous hydrogen fluoride (HF) while the terminal C1 was still blocked. A prolonged
oxidation experiment was carried out for the formation of the (S-S)-bridge
between C2 and C36. After the completion of the oxidation step, the cyclic V3 loop
was purified by analytical HPLC.

NMR 

All  NMR  experiments  are  performed  using  a  500  MHz  (Varian  Unity)  NMR 
spectrometer. 

The  following  steps  are  performed  for  the  analysis  of  the  NMR  data. 

(i)  Sequential  Assignment 

At first, fingerprint HN-Hα correlations are obtained from the corresponding TOCSY
and DQF-COSY cross-sections of the V3 loop in 90% H2O + 10% D2O. The sidechains
are identified from HN-Hα-Hβ correlations in the TOCSY spectra
and Hα-Hβ correlations in the DQF-COSY spectra of the V3 loop in
90% H2O + 10% D2O and in 100% D2O. Once isolated spin-systems belonging to constituent
amino acids are identified, the sequential connectivity is obtained by monitoring
the NOESY/ROESY cross-peaks for Hα(i)-HN(i+1), Hβ(i)-HN(i+1), &
HN(i)-HN(i+1) interactions. NOESY data are collected for mixing times of 100 and
400 ms. ROESY data are collected for mixing times of 60 and 200 ms.

(ii)  Extraction  of  Inter-proton  Distances  as  Structural  Constraints 

This step involves obtaining an energetically stable (S-S)-bridged structure for the Thailand
TN243 loop sequence given the secondary structural states of the constituent amino
acid residues as obtained by analyzing the sequential NOESY and ROESY pattern.
Appropriate ranges of (φ, ψ) values are assigned to all amino acids. For example,

φ = -55° ± 25°, ψ = -55° ± 25° for residues in a helix;

φ = -140° ± 50°, ψ = 140° ± 50° for residues in a beta strand or in an extended conformation;

φ(i+1) = -65° ± 20°, ψ(i+1) = -50° ± 20° and
φ(i+2) = -90° ± 20°, ψ(i+2) = 0° ± 20° for residues in a type-I turn;

φ(i+1) = -65° ± 20°, ψ(i+1) = 120° ± 20° and
φ(i+2) = 90° ± 20°, ψ(i+2) = 0° ± 20° for residues in a type-II turn.

The (φ, ψ) values of residues in the coil state are free to choose any point in the allowed space [for
definitions of different secondary structures and corresponding (φ, ψ)-values, see reference
10]. First we obtain an (S-S)-bridged structure for a pseudo (CA33C)-sequence in the following
manner. All residues with sidechains extending beyond Cβ atoms are treated as A
except the Ps & Gs and the terminal Cs. Our rationale for doing this is that the allowed (φ,
ψ) space of residues with a sidechain longer than A is only a subspace of that allowed for
A [10]. We obtain an (S-S)-bridged structure of a V3 loop by a linked-atom-least-
square refinement [11] by minimizing a function, F, only in the (φ, ψ) space.




Gupta et al.


$$F = \text{R-Factor} + \sum_i \lambda_i G_i + \sum_{i,j} \left( d_{ij}^{mn} - D^{mn} \right)^2$$

$$\text{R-Factor} = \frac{\sum |I_o - I_c|}{\sum I_o} \qquad [1]$$

Io = observed NOESY intensity and Ic = calculated NOESY intensity by full-matrix
NOESY simulations. The sum extends over all pairs (i,j) of observed NOESY
cross-peaks.
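
The R-Factor term defined above compares observed and calculated NOESY cross-peak intensities. A minimal sketch of that one term is given below; it does not perform the full-matrix NOESY simulation that produces the calculated intensities, and the function name `r_factor` and the list-based input are assumptions made for illustration.

```python
def r_factor(observed, calculated):
    """R-Factor = sum(|Io - Ic|) / sum(Io) over matched cross-peaks.

    observed, calculated: equal-length sequences of NOESY intensities,
    one entry per observed cross-peak (i, j).
    """
    if len(observed) != len(calculated):
        raise ValueError("observed and calculated must have the same length")
    total = sum(observed)
    if total == 0:
        raise ValueError("sum of observed intensities is zero")
    return sum(abs(io - ic) for io, ic in zip(observed, calculated)) / total
```

A value of zero means the calculated intensities reproduce the observed ones exactly; larger values indicate poorer agreement.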

G_i (= |r_i - r°_i| = 0) indicates distance constraints for an (S-S) bridge. Distances in the
(S-S)-bridged V3 loop configuration are defined as r1 = S(C1)-S(C35), r2 = Cβ(C1)-
S(C35), r3 = Cβ(C35)-S(C1) and r4 = Cβ(C1)-Cβ(C35); corresponding equilibrium
distances are r°1 = 2.04 Å, r°2 = r°3 = 3.05 Å, r°4 = 3.85 Å [12,13]. λ_i indicates
Lagrangian multipliers; d_ij^mn indicates the distance between atom i (type m) and
atom j (type n); and D^mn indicates the contact limit between atom (type m) and
atom (type n) [10]. In this refinement, the (φ, ψ) values of various residues are treated
as elastic variables (i.e., variables with weights) such that by appropriate choice of
weights the experimentally determined secondary structural states of residues are
minimally altered [14]. This method guarantees a stereochemically orthodox structure
for the (S-S)-bridged (CA33C)-like sequence. Finally, appropriate sidechains
are attached to generate an actual V3 loop sequence and the potential energy of the
system is minimized in the (φ, ψ, ω, χ)-space using the force-field of Scheraga and
co-workers [12]. The minimization of the function, F, for the virtual (S-S)-bridged
(CA33C) system followed by the energy minimization for the actual V3 loop is
repeated for 50 different starting structures (Fletcher, 1984). At the end of this step we
obtain a set of 50 models in agreement with the NOESY and ROESY data. From
these models, a set of inter-proton distances is extracted as structural constraints
required for agreement with the NOESY/ROESY data. Each pairwise distance representing
a structural constraint provides an upper and a lower limit of the distance.
Two types of constraints are identified.

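Type I

This is given as

$$\mathrm{EDIST} = \begin{cases} 0 & \text{if the distance } r \text{ is within the specified range } r_1 \text{ to } r_2 \\ k\,(r - r_1)^2 & \text{if } r < r_1 \\ k\,(r - r_2)^2 & \text{if } r > r_2 \end{cases}$$

where k is the force constant.

Type II

This is given as

$$\mathrm{EDIST} = \begin{cases} 0 & \text{if } r > r_1 \\ k\,(r - r_1)^2 & \text{if } r < r_1 \end{cases}$$

This type is particularly useful for an unobserved NOE where we can set a lowest
allowable distance limit for the corresponding proton pair.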
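
The two constraint types can be written as small penalty functions. The sketch below is only an illustration of the piecewise definitions given above; the function names and the default force constant `k=1.0` are assumptions, since the text does not state the value of k.

```python
def edist_type1(r, r1, r2, k=1.0):
    """Type I: zero inside [r1, r2], harmonic penalty outside the range."""
    if r < r1:
        return k * (r - r1) ** 2
    if r > r2:
        return k * (r - r2) ** 2
    return 0.0

def edist_type2(r, r1, k=1.0):
    """Type II: lower bound only, e.g. for an unobserved NOE."""
    if r < r1:
        return k * (r - r1) ** 2
    return 0.0

def total_edist(distances, type1_limits, type2_limits, k=1.0):
    """Sum the penalties over all constrained proton pairs.

    distances: dict mapping a pair label to its current distance.
    type1_limits: dict mapping a pair label to (r1, r2).
    type2_limits: dict mapping a pair label to r1.
    """
    energy = 0.0
    for pair, (r1, r2) in type1_limits.items():
        energy += edist_type1(distances[pair], r1, r2, k)
    for pair, r1 in type2_limits.items():
        energy += edist_type2(distances[pair], r1, k)
    return energy
```

Because both penalties vanish whenever the limits are satisfied, a model that agrees with every constraint contributes nothing to EDIST, and only violated limits raise the energy.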

(iii) Exploration of the Conformational Flexibility Subject to the NMR Data

The energy term, EDIST, is added to the force-field as in Scheraga and co-workers
[12; QCPE 454]. The simulated annealing is performed [15] in the following manner.
First, a starting energy-minimized structure is chosen and Monte Carlo (MC)
simulations are performed for 50,000 steps at 600 K in the (φ, ψ, ω, χ)-space; the last
accepted configuration is stored to be subsequently used as a starting configuration
in the next lower temperature-cycle. Second, 50,000 MC steps are repeated in several
cycles of gradually decreasing temperature until a temperature of 100 K is reached.
Third, the lowest energy configuration at 100 K is further energy-minimized to a low
energy gradient. This is the "temperature quenching" step in which thermally
excited single bond rotations around the equilibrium positions are quenched.
Finally, the first through third steps are repeated for 100 different starting configurations,
which comprise the 50 starting structures obtained after Step (ii)
and 50 other structures obtained after carrying out the first through third steps
of the simulated annealing protocol on the very
same 50 starting structures.
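
The temperature schedule described above can be sketched with a generic Metropolis Monte Carlo loop. This is a toy illustration under stated assumptions and not the program used in this work: the energy function is supplied by the caller (the actual force-field plus EDIST is not reproduced), the intermediate temperatures and the trial step size are hypothetical because the text gives only the 600 K start and the 100 K end point, and the final energy minimization ("temperature quenching") is omitted.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)

def metropolis_cycle(energy, x, temperature, n_steps, step, rng):
    """Run n_steps of Metropolis MC at one temperature.

    Returns (last accepted configuration, lowest-energy configuration
    seen in this cycle, its energy).
    """
    x = list(x)
    e = energy(x)
    best_x, best_e = list(x), e
    for _ in range(n_steps):
        trial = list(x)
        i = rng.randrange(len(trial))          # perturb one torsion angle
        trial[i] += rng.uniform(-step, step)
        e_trial = energy(trial)
        delta = e_trial - e
        if delta <= 0 or rng.random() < math.exp(-delta / (K_B * temperature)):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = list(x), e
    return x, best_x, best_e

def anneal(energy, start, temperatures=(600, 500, 400, 300, 200, 100),
           n_steps=50000, step=10.0, rng=random):
    """Cool through the given temperatures (hypothetical schedule).

    The last accepted configuration of each cycle seeds the next cycle;
    the lowest-energy configuration of the final cycle is returned.
    """
    x = list(start)
    best_x, best_e = list(x), energy(x)
    for t in temperatures:
        x, best_x, best_e = metropolis_cycle(energy, x, t, n_steps, step, rng)
    return best_x, best_e
```

In the protocol described in the text, the configuration returned at 100 K would then be energy-minimized, and the whole procedure repeated for each of the 100 starting configurations.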

(iv)  Analyses  of  the  V3  Loop  Structures 

Low energy structures are analyzed in terms of their (φ, ψ, ω, χ) values, agreement with the
NMR data, and agreement with the φ-values as estimated from the HN-Hα J-
coupling data obtained from the DQF-COSY experiments. For J(HN-Hα) > Hz the
stipulated constraint on φ is 110° ± 40°. However, no φ-constraint is imposed during
simulated annealing; but in the final model if φ falls far outside the expected range
(as determined from the DQF-COSY data in Figures 3 & 7), then the corresponding
model is discarded.
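
The post hoc φ filter described above can be sketched as a periodic range check. The function names, the `slack` parameter standing in for "far outside", and the per-residue input layout are assumptions for illustration; the expected centre and half-width for each residue are supplied by the caller from the J-coupling analysis.

```python
def angular_deviation(phi, center):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((phi - center + 180.0) % 360.0 - 180.0)

def phi_in_range(phi, center, half_width, slack=0.0):
    """True if phi lies within center +/- (half_width + slack)."""
    return angular_deviation(phi, center) <= half_width + slack

def keep_model(model_phi, expected, slack=0.0):
    """Keep a model only if every constrained residue passes the check.

    model_phi: dict mapping residue label to phi (degrees) in the model.
    expected: dict mapping residue label to (center, half_width).
    """
    return all(
        phi_in_range(model_phi[res], center, half_width, slack)
        for res, (center, half_width) in expected.items()
    )
```

The modular arithmetic in `angular_deviation` makes the comparison insensitive to the 360° periodicity, so that, for example, 170° and -170° are treated as 20° apart.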

Results 

NMR  Data 

Figure 1 shows the amino acid sequence of the Chiang Mai V3 loop. The N-terminal
modified C1 is added for facilitating conjugation in antibody production
experiments. Note that C2 and C36 are involved in (S-S) bridge formation. A total of
35 HN protons are present in the system, i.e., excluding two Ps in the sequence and
including the terminal blocking group at C1. Figure 2 shows the TOCSY HN-Hα
fingerprint region of the Thailand TN243 V3 loop at 10°C. All the spin systems are
observed; the only exception is the HN of the blocking group. The sequential assignment
of all the cross-peaks is indicated in Figure 2. The amino acid residues that
occur only once in the sequence are easily assigned, e.g., Q19, V20, F21, K33, & A34.
The spin systems of G and S are distinguished by examining the DQF-COSY Hα-Hα'
& Hα-Hβ cross-peak patterns. The spin systems of the branched sidechains of T
and I are identified by monitoring the Hα-Hβ-Hγ TOCSY and DQF-COSY coupling
patterns. The spin system of R is identified by monitoring the Hα-Hβ-Hγ J-coupling
pattern and Hγ-Hδ-NH NOE pattern. The spin system of Y is assigned by monitoring
the Hα-Hβ & 2,6H-3,5H J-coupling patterns and the Hβ-Hring NOE pattern. The
spin systems of D and N are distinguished by the presence of the NH2 group. Figure
3 displays the fingerprint region of a DQF-COSY spectrum of the Thailand TN243
V3 loop at 10°C showing the HN-Hα cross-peaks. Positive and negative contours are
plotted without distinction. Here the shapes of the peaks are distorted because both
sine-bell and shifted sine-bell apodization functions are used to sharpen the peaks.
Chiang Mai V3 Loop

Figure 1: The amino acid (aa) sequence of the Thailand TN243 V3 loop. C2 and C36 form a (S-S) bridge. A modified
C1 is present at the N-terminal. The protecting group on the N-terminal is removed before conjugation to
BSA, a step involved in antibody production. Amino acids containing the GPG-crest and the two flanking regions
(extending up to 4-5 aa on each side) define the immunogenic tip of the V3 loop. N7 (within the recognition element
N7-N8-T9) is the site of N-linked glycosylation. Although epitope mapping for this particular V3 loop is not reported,
amino acids within the aa13-27 stretch of the V3 loop generally make contact with the antibody [22,23].
Figure 2: HN-Hα fingerprint TOCSY cross-section. The TOCSY experiment is performed under the
following conditions: temperature = 10 °C, pH = 4.5, polypeptide concentration = 8 mM, solvent composition
= 90% H2O + 10% D2O. The acquisition parameters: data points in t2 = 2048; complex data points
in t1 = 256; relaxation delay, RD = 1.5 s; mixing time = 35 ms. The HDO signal is suppressed by employing
the pulse sequence of Sklenar and Bax [15]. Several contours are drawn starting from the lower level
of intensity in order to show all the (HN-Hα) cross-peaks.

However, for accurate measurement of the J(Hα-HN) couplings (see Figure 7 below),
appropriate apodization functions and line-width corrections are used to make the two
extrema as symmetric to each other as possible [17]. Out of the 34 possible (HN,Hα) spin
systems, J(HN-Hα) coupling data are obtained for 27 residues. No DQF-COSY HN-Hα
cross-peaks are obtained for C1, G18, & G29. Overlap of the HN-Hα cross-peaks for
K33 & A34 precluded measurements of the corresponding J couplings. The HN-Hα
cross-peak of T3 is too weak and poorly defined to allow measurement of J coupling. A
weak HN-Hα cross-peak is observed for only one of the G16 protons.

Even though isolated spin systems of all the amino acids are identified from the
TOCSY and the DQF-COSY experiments, the sequential assignment is possible
only after examining the inter-residue Hα(i)-HN(i+1) and Hβ(i)-HN(i+1) NOESY
connectivities as shown in Figures 4 and 5, respectively.



Figure 3. HN-Hα fingerprint DQF-COSY cross-section. The DQF-COSY experiment is performed
under the following conditions: temperature = 10 °C, pH = 4.5, polypeptide concentration = 8 mM, solvent
composition = 90% H2O + 10% D2O. The acquisition parameters: data points in t2 = 2048; t1 increments =
256; relaxation delay, RD = 1.5 s. The HDO signal is presaturated for 1 s during the relaxation delay.
Fourier transformation is performed on a (2048 x 1024) data matrix with a combination of Gaussian,
sine-bell, and shifted sine-bell apodization functions.


Figure 4 shows the Hα-HN NOESY cross-section (mixing time = 400 ms) of the
Thailand TN243 V3 loop at 10°C. Note that a NOESY cross-peak corresponding to the
Hα-HN interaction of the N-term is observed in this cross-section. A few intra- and
inter-residue Hβ-HN interactions are also observed in this cross-section. The NOE
pattern is consistent with the presence of three turns located at R4-P5-S6-N7,
G16-P17-G18-Q19, and T24-G25-D26-I27. The sequential Hα-HN NOEs and distinctive
long range NOEs for HN(C2)-Hα(S6) & HN(C2)-Hβ(S6) support the presence of a
type-I turn at R4-P5-S6-N7. The sequential Hα(P17)-HN(G18) overlaps with the
intra-residue Hα-HN of T9; however, the other sequential Hα-HN NOEs in the G16-P17-G18-
Q19 segment are consistent with a type-II turn with a single H-bond between C=O(G16) &
HN(Q19). In view of the fact that a strong Hα(P17)-HN(G18) NOE distinguishes a
type-II from a type-I turn [18], the overlap of this cross-peak in the spectrum (Figure
4) prevents us from unequivocally assigning a G16-P17-G18-Q19 type-II turn.
However, in other V3 loop sequences with GPGR-sequence, a type-II turn is indicated
Figure 4: HN-Hα NOESY cross-section. The NOESY experiments are performed under the following
conditions: temperature = 10 °C; pH = 4.5; polypeptide concentration = 8 mM; solvent composition =
90% H2O + 10% D2O. The acquisition parameters: data points in t2 = 2048; complex data points in t1 = 256;
relaxation delay, RD = 1.5 s; mixing time = 400 ms. The HDO signal is suppressed by presaturation (for 1
s) during the relaxation delay.

[19,20]. In addition, simulated annealing of the GPGQ-fragment shows the energetic
preference of a type-II over a type-I turn. Therefore, we considered a type-II turn for the
GPGQ-sequence of the Chiang Mai V3 loop. The sequential Hα-HN NOE pattern
and weak Hα(G25)-HN(I27) and HN(T24)-Hα(I27) NOEs suggest a type-I turn for
the T24-G25-D26-I27 sequence. An extended conformation (which includes the
single β-strand) is identified for the I28-G29-D30-I31 sequence because of the presence
of mostly strong sequential Hα-HN NOESY cross-peaks in Figure 4. An
extended conformation is also located at the T11-S12-I13-T14-I15 sequence; residues
in this segment show sequential Hα-HN NOESY cross-peaks of medium intensity
(see Figure 7 below for the summary of NOE and ROE data). The centers of Hα-HN
NOESY cross-peaks for the T11-S12-I13-T14-I15 sequence are not clearly defined
in Figure 4. However, the centers of Hα-HN NOESY cross-peaks for T11-S12 & S12-
I13 are distinguished by examining the NOESY cross-section at 100 ms mixing time
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Table  I 

Chemical  shift  values  in  ppm.  The  assignment  is  obtained  by  analyzing  TOCSY,  DQF-COSY,  & 
NOESY  data  of  the  Thailand  TN243  V3  loop  in  90%H2O+  lO^/oDjO  and  TOCSYdata  in  lOOroD^O.  TSP  is 
used  as  an  internal  standard.  Note  that  the  complete  spin  system  of  R4,  RIO  and  R23  beyond  yCH/pCH 
could  not  be  assigned.  The  sidechain  NHs  of  these  residues  appear  as  broad  peaks  at  (6.52,6.81  ppm). 


Residue  HN     Hα          Hβ          Other

C1       8.29   4.59        3.12,3.10
C2       8.23   4.31        3.19,3.00
T3       8.28   4.36        4.24        γCH3 1.18
R4       8.46   4.40        1.82,1.82   γCH2 1.59
P5       -      4.40        2.20,2.20   γCH2 1.96,1.96; δCH2 3.75,3.56
S6       8.35   4.50        3.83,3.83
N7       8.55   4.70        2.78,2.78   NH 6.94,7.62
N8       8.49   4.73        2.82,2.76   NH 6.96,7.64
T9       8.63   4.39        4.24        γCH3 1.18
R10      8.37   4.28        1.92,1.92   γCH2 1.62
T11      8.20   4.37        4.28        γCH3 1.18
S12      8.51   4.40        3.89,3.84
I13      8.20   4.14        1.82        γCH2 1.43,1.20; γ & δ CH3 0.85
T14      8.32   4.36        4.11        γCH3 1.12
I15      8.28   4.08        1.80        γCH2 1.43,1.20; γ & δ CH3 0.85
G16      8.34   4.16,4.03
P17      -      4.41        2.24,2.24   γCH2 2.00,2.00; δCH2 3.75,3.56
G18      8.64   3.92,3.94
Q19      8.07   4.28        2.12,1.90
V20      8.10   4.02        1.81        γCH3 0.81,0.70
F21      8.33   4.62        2.88,2.88   2,6H 7.14; 3,5H 7.24
Y22      8.50   4.57        2.93,2.97   2,6H 6.78; 3,5,4H 7.09
R23      8.63   4.26        1.98,1.98
T24      8.56   4.37        4.24        γCH3 1.18
G25      8.46   3.92,3.92
D26      8.28   4.68        2.82,2.79
I27      8.18   4.12        1.85        γCH2 1.43,1.20; γ & δ CH3 0.85
I28      8.30   4.22        1.63        γCH2 1.43,1.20; γ & δ CH3 0.85
G29      8.51   4.01,3.89
D30      8.32   4.69        2.84,2.76
I31      8.33   4.28        1.82        γCH2 1.43,1.20; γ & δ CH3 0.85
R32      8.43   4.59        1.84        γCH2 1.62; δCH2 3.10; NH 7.24,7.30
K33      8.22   4.24        1.84,1.84   γCH2 1.36; δCH2 1.65; εCH2 2.95; NH 7.54
A34      8.24   4.26        1.26
Y35      8.22   4.31        2.98,2.87   2,6H 6.74; 3,5H 7.05
C36      8.18   4.31        2.82,2.90

and the ROESY cross-section at 60 ms mixing time (data not shown). The inter-residue
Hα-HN NOESY cross-peaks for I13-T14 and T14-I15 partially
overlap in all NOESY and ROESY cross-sections. The assignments of (Hα, HN)
of C2, T3, and R4 do not readily follow from the connectivity route because the Hα of C2,
T3, and R4 are not sufficiently resolved in chemical shift to allow construction of
a clear Hα(i)-HN(i+1) connectivity. Because there are other possible Hα(i)-HN(i+1)
connectivities involving T-R/R-T sequences, we first constructed the Hα(i)-HN(i+1)
connectivity route for the relatively non-overlapping regions of the spectrum. We
deferred the assignment of C2, T3, and R4 until the end, when C2 is distinguished from

Figure 5: HN-Hβ NOESY (mixing time = 400 ms) cross-section. The experimental and solution
conditions are the same as in Figure 4.


C36 by the presence of C2-S6 interactions, and T3 and R4 are easily distinguished in
terms of their intra-residue TOCSY/NOESY HN-Hβ spin correlations. Note that the
centers of the (Hα-HN) NOESY cross-peaks (Figure 4) are not as clearly defined as
the corresponding TOCSY (Figure 2) or DQF-COSY (Figure 3) cross-peaks. Therefore,
there are mismatches between the centers of the TOCSY cross-peaks and the apparent
centers of the corresponding NOESY cross-peaks. We have, however, been consistent
in adhering to the peak centers as determined from the TOCSY (Figure 2) or
DQF-COSY (Figure 3) cross-sections; therefore, the lines in the NOESY cross-section
(Figure 4) do not appear to pass exactly through the apparent centers of two
(Hα-HN) cross-peaks. The chemical shift values given in Table I correspond to the
centers of the TOCSY peaks.


Figure 5 shows the Hβ-HN NOESY cross-section (mixing time = 400 ms) of the
Thailand TN243 V3 loop at 10 °C. Note the presence of sequential Hβ(i)-HN(i+1)
NOESY connectivity for the residues in the C-terminal I31-R32-K33-A34-Y35 segment.
Such an NOE pattern is expected for residues in an α-helical segment.
However, the fact that sequential HN-HN NOESY cross-peaks are not observed for all the
residues in the segment (Figure 6) suggests that the residues in this segment are only partially
folded and do not form a regular rigid helix (see Figures 8 & 9 below). R32 shows
the intra-residue HN-Hβ NOE and the inter-residue Hβ(I31)-HN(R32) NOE. An
inter-residue NOE is also observed for the sequential Hβ(Q19)-HN(V20) interaction. The
NOEs involving Hβ(P5)-HN(N7) are consistent with a turn at the R4-P5-S6-N7
sequence. The strong and medium NOEs in Figures 4 and 5 are also present in the
corresponding NOESY cross-sections at 100 ms of mixing. However, the sequential
HN-HN NOESY cross-peaks are only observed at 400 ms of mixing. Figure 6 shows

Figure 6: HN-HN NOESY (mixing time = 400 ms) cross-section. The experimental and solution
conditions are the same as in Figure 4.


this cross-section. The cross-peak for the HN(G18)-HN(Q19) interaction is consistent
with a turn at G16-P17-G18-Q19. In the partially folded segment, the cross-peak
HN(R32)-HN(K33) is clearly visible; the cross-peaks due to HN(K33)-HN(A34) and
HN(A34)-HN(Y35) are too close to the diagonal to be clearly visible. Because
strong NOESY cross-peaks are obtained for the Hβ(K33)-HN(A34) and Hβ(A34)-HN(Y35)
interactions even at 100 ms mixing time, medium NOESY intensities with soft force
constants (k = 1 kcal/mol) are assumed for these two HN-HN distances. The cross-peak for
HN(G25)-HN(D26) is consistent with a type-I turn at T24-G25-D26-I27.

Solution NMR studies were previously reported for two other types of V3 loop
sequences [19,20]. As in the present case, structure determination from the 2D
NMR data in the previous two cases was complicated by poor resolution and peak
overlap in the NOESY spectrum. As also noted in the previous studies, the
complications in the NMR spectrum were primarily due to the flexible nature of the V3
loops. Although some local structures are formed, mostly as putative
turns [19,20], the overall folding of the two previously reported V3 loops was rather
loose. In addition, the simultaneous observation of sequential Hα-HN (typical
for an extended or beta conformation) and sequential HN-HN (typical for an
alpha-helical conformation) NOESY cross-peaks indicated that perhaps the V3 loop
(especially for the HIV-MN isolate, reference 19) was sampling two widely different
conformations. However, in the case of the Thailand TN243 V3 loop, only weak
sequential HN-HN NOESY cross-peaks are observed for a few residues [except for
NH(G18)-NH(Q19), which is of medium intensity], whereas the sequential Hα-HN
cross-peaks for the residues assigned to the extended or beta conformation are quite
strong. This, coupled with the (Hα-HN) J-coupling data (Figure 7), suggests that most
of the residues (though flexible) are, indeed, in the extended or beta conformation.

As discussed in the following section, we have attempted to visualize the
nature of the conformational flexibility of a particular V3 loop from a Thailand HIV
isolate. The flexibility is analyzed subject to the NMR data, which show the
presence of three localized turns and two stretches of extended conformation.

Structure  and  Flexibility 

Figure 7 summarizes the results of the NMR experiments [18]. The majority of the
observed NOEs are due to intra-residue and nearest-neighbor inter-residue interactions.
Seven observed inter-residue NOEs are due to long-range interactions, i.e., pairwise
interactions involving residues that are not nearest neighbors. The interaction
involving P5 and N7 suggests a type-I turn at the site of glycosylation, R4-P5-S6-N7*
(N7* being the locus of N-linked glycosylation). Similarly, the interaction involving
T24 and I27 suggests a type-I turn at T24-G25-D26-I27. In addition, a type-II turn at
G16-P17-G18-Q19 and two β-strands/extended conformations flanking this turn
are indicated on the basis of intra-residue and nearest-neighbor inter-residue
NOEs. The secondary structural features mentioned above and the constraints of
the (S-S) bridge between C2 and C36 (Figure 1) are used in Step (ii) of Materials and
Methods to obtain various models in agreement with the NOESY/ROESY data.
From these models a set of 100 inter-residue distances is extracted as structural
constraints that are crucial for agreement with the NMR data. Observation of
almost all intra-residue Hα-HN NOESY cross-peaks suggests that the effective tumbling
time of the individual peptide units in the V3 loop is long enough for us to
capture distances of less than 3 Å in the laboratory-frame NOESY experiment.
However, the inherent flexibility of the link between two neighboring peptide units
allows a whole range of dynamics for the sequential interactions. The absence of a
given sequential NOESY cross-peak does not necessarily imply that the corresponding
distance is beyond a certain limit. The local dynamics of the link between
two neighboring peptide units may reduce the effective correlation time and thereby
place ωτc in the intermediate range, so that the corresponding NOESY peaks elude
observation in the laboratory-frame experiments. We therefore supplemented the
2D NOESY data (100 & 400 ms of mixing time) with ROESY data (60 & 200 ms of mixing
time). This allowed us to observe inter-proton distances below a limit regardless
of their effective correlation time. Full relaxation-matrix analyses allowed us
to stipulate two types of constraints: (1) upper and lower distance limits for the
observed NOESY/ROESY peaks and (2) lower distance limits for the unobserved
NOESY/ROESY peaks.

Figure 7: Summary of the DQF-COSY data, the NOESY data at mixing times of 100 & 400 ms, and the ROESY data at mixing times of 60 & 200 ms, showing the
interaction pattern for intra- and inter-residue pair-wise contacts in the Chiang Mai V3 loop.

These constraints are discussed in the Materials and Methods
and in Table II of the supplementary material. Only inter-residue distances are used
as distance constraints. The force constants, k, for the distance constraints corresponding
to the observed peaks are set at 5 kcal/mol. During MC simulated
annealing and energy minimization, if these constraints are violated by more than 1
Å, k is increased to 50 kcal/mol to bring the distances close to the constrained
values. The force constant, k, for the unobserved distances is set at 0.5 kcal/mol
so that the conformational search is not biased by this set of constraints. A soft
force constant, k, of 1 kcal/mol is used for the distances that correspond to the
observed (i) and (i+2/3/4) NOEs/ROEs (which are weak in intensity even at 400 ms
mixing time).
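
The force-constant scheme described above can be illustrated with a minimal Python sketch. The text gives only the values of k (5, 50, 0.5, and 1 kcal/mol) and the 1 Å violation threshold; the flat-bottom harmonic form of the penalty, the per-Å² interpretation of k, and the function name `restraint_energy` are assumptions made here for illustration and are not taken from the Materials and Methods.

```python
def restraint_energy(d, lower, upper, k=5.0, k_tight=50.0, tolerance=1.0):
    """Penalty (kcal/mol) for one inter-proton distance restraint.

    ASSUMPTION: a flat-bottom harmonic well; the source does not state
    the functional form. d, lower, upper and tolerance are in angstroms.
    """
    if d < lower:
        violation = lower - d
    elif d > upper:
        violation = d - upper
    else:
        return 0.0  # inside the allowed range: no penalty
    # Stiffen the restraint once the violation exceeds the tolerance,
    # mirroring the switch from k = 5 to k = 50 described in the text.
    k_eff = k_tight if violation > tolerance else k
    return k_eff * violation ** 2


# Observed peak, 3.0-4.0 A range: small and large violations.
small = restraint_energy(4.5, 3.0, 4.0)   # 0.5 A over the upper limit
large = restraint_energy(5.5, 3.0, 4.0)   # 1.5 A over, stiff constant applies

# Unobserved peak: lower limit only, weak constant k = 0.5.
unobserved = restraint_energy(4.0, 4.5, float("inf"), k=0.5)
```

Under these assumptions a restraint costs nothing while the distance stays inside its range, and the weak constant for unobserved peaks keeps them from dominating the search.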

MC simulated annealing, followed by energy minimization subject to 100 distance
constraints, led to about 100 low-energy structures in agreement with the NMR data
[see step (iii) of Materials and Methods]. Out of these, 37 models with the lowest energies
and rms deviations of 0.40 ± 0.10 Å with respect to the 100 distance constraints are
chosen as structural solutions. This rms deviation is measured beyond the allowed distance
range stipulated for agreement with the NMR data; hypothetically, if this deviation
were zero, we would have perfect agreement with the NMR data. Although
several starting structures are chosen, the simulated annealing protocol used in this
work is by no means an exhaustive search of the conformational space of the V3
loop. This method allows us to search for the local minima in the sampled space of
the V3 loop. The sampling is guided by the force field and the set of inter-residue
distance constraints that satisfy the NMR data. The inherent nature of the flexibility
of this V3 loop is visualized by making the following observations. Although in the
set of energy-minimized structures the standard deviations in the (φ, ψ)-values are
within 30°, the average rms deviations in the spatial positions of the Cα-atoms among
the 37 structures (Figure 8B) are as large as 1.5 Å. This shows that even small changes in
(φ, ψ)-values, when appropriately correlated, cause a large overall change. The data
presented here show the nature of the conformational flexibility in the Chiang Mai V3
loop, arising mainly from correlated motions around the single-bond torsions, i.e., the
peptide backbone angles (φ, ψ), at little cost in energy. The extent of flexibility
shown here reflects only a lower bound because the thermal motions
are filtered out by energy minimization.
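
As a rough illustration of how a deviation "beyond the allowed distance range" can be computed, the following Python sketch measures how far a model distance falls outside its range and combines several such deviations into an rms value. The three (DIST, DEQ1, DEQ) triples are taken from Table II; the function names are hypothetical, and the sketch works on single distances, whereas the tabulated DEV values are averages over the selected structures.

```python
import math


def deviation(dist, lower, upper):
    """Distance (in angstroms) by which dist falls outside [lower, upper]."""
    if dist < lower:
        return lower - dist
    if dist > upper:
        return dist - upper
    return 0.0


def rms_deviation(rows):
    """Root-mean-square deviation over (dist, lower, upper) triples."""
    devs = [deviation(d, lo, hi) for d, lo, hi in rows]
    return math.sqrt(sum(x * x for x in devs) / len(devs))


# (DIST, DEQ1, DEQ) for HA(CYS2)-HN(THR3), HN(CYS2)-HA(SER6), HA(THR3)-HN(ARG4)
rows = [(3.66, 4.00, 5.00), (4.41, 3.00, 4.00), (3.71, 3.00, 4.00)]
per_pair = [deviation(d, lo, hi) for d, lo, hi in rows]
overall = rms_deviation(rows)
```

For these three pairs the per-pair deviations agree with the DEV column of Table II (.34, .41, and .00), which suggests, without proving, that the table was built on the same definition.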

Figure  8A  shows  the  average  folding  pattern  of  the  V3  loop;  different  secondary 
structural  elements  are  color  coded.  Although  most  of  the  residues  are  in  the  beta/ 
extended  state  (shown  in  green),  the  constituent  secondary  structural  elements  are 
so  assembled  that  in  all  37  structures  the  GPG-crest  at  the  center  of  the  V3  loop 
forms  a  protruding  surface  with  Q19  exposed  to  the  environment.  Also  the  turn  at 
R4-P5-S6-N7*  exposes  N7*  to  the  environment  such  that  N7*  is  accessible  for 
glycosylation  (Figure  8B).  It  may  be  noted  that  the  amino  acid  site  of  glycosylation 
often  forms  a  beta-turn  and  resides  on  the  exposed  surface  of  the  protein  [21]. 

As shown in Figure 9, the extent of the flexibility of the Chiang Mai V3 loop is best
described by analyzing the average and the corresponding standard deviations of

Table II

Inter-residue distance constraints for 101 proton-pairs. The constraints belong to two categories: (1)
constraints with specified ranges; (2) constraints specifying that the target distances should be above the
specified upper limits. DIST = average distance for a proton-pair; the standard deviation is shown inside the
parentheses. DEQ1 and DEQ are the lower and upper distance limits. DEV = average deviation from the
specified constraint. Note that only about 10 constraints are violated by more than 1 Å.


ATOM1  RES1    ATOM2  RES2    DIST          DEQ1   DEQ    DEV

HA     CYS2    HN     THR3    3.66 (.02)    4.00   5.00    .34
HN     CYS2    HA     SER6    4.41 (.17)    3.00   4.00    .41
HN     CYS2    HB     SER6    5.04 (.19)    4.00   5.00    .10
HA     THR3    HN     ARG4    3.71 (.05)    3.00   4.00    .00
HA     PRO5    HN     SER6    2.27 (.08)    4.00   5.00   1.73
HA     SER6    HN     ASN7    2.68 (.07)    3.50   4.50    .82
HA     ASN7    HN     ASN8    2.60 (.07)    2.50   3.00    .00
HA     ASN8    HN     THR9    2.64 (.08)    4.50   5.50   1.86
HA     THR9    HN     ARG10   2.24 (.13)    5.00   5.00   2.76
HA     ARG10   HN     THR11   2.35 (.04)    2.50   3.00    .15
HA     THR11   HN     SER12   2.52 (.07)    2.50   3.00    .02
HA     SER12   HN     ISO13   2.34 (.07)    2.50   3.00    .16
HA     ISO13   HN     THR14   2.19 (.01)    2.50   3.00    .31
HA     THR14   HN     ISO15   2.82 (.05)    2.50   3.00    .00
HA     ISO15   HN     GLY16   2.18 (.01)    3.50   4.50   1.32
HA     PRO17   HN     GLY18   2.49 (.07)    2.00   2.50    .02
HA     GLY18   HN     GLN19   3.49 (.03)    3.00   3.75    .00
HA     GLN19   HN     VAL20   3.42 (.05)    3.00   3.75    .00
HA     VAL20   HN     PHE21   2.46 (.05)    2.20   2.70    .00
HA     PHE21   HN     TYR22   2.30 (.11)    4.00   5.00   1.70
HA     TYR22   HN     ARG23   2.24 (.05)    4.50   5.50   2.26
HA     ARG23   HN     THR24   2.30 (.05)    3.50   4.50   1.20
HA     THR24   HN     GLY25   2.21 (.02)    2.95   2.95    .74
HA     GLY25   HN     ASP26   2.83 (.28)    2.20   2.70    .19
HA     ASP26   HN     ISO27   2.47 (.04)    2.20   2.70    .00
HA     ISO27   HN     ISO28   2.27 (.02)    2.20   2.70    .00
HA     ISO28   HN     GLY29   2.31 (.03)    2.20   2.70    .00
HA     GLY29   HN     ASP30   2.74 (.21)    2.50   3.00    .01
HA     ASP30   HN     ISO31   2.39 (.04)    2.50   3.00    .11
HA     ISO31   HN     ARG32   2.52 (.04)    2.50   3.00    .01
HA     ARG32   HN     LYS33   2.95 (.27)    4.00   4.00   1.05
HA     LYS33   HN     ALA34   2.93 (.16)    4.00   4.00   1.07
HA     ALA34   HN     TYR35   3.17 (.16)    4.00   4.00    .83
HA     TYR35   HN     CYS36   2.94 (.07)    4.00   4.00   1.06
HB     CYS2    HN     THR3    3.78 (.15)    4.50   4.50    .72
HB     THR3    HN     ARG4    3.06 (.20)    2.50   3.50    .00
HB     PRO5    HN     SER6    4.21 (.17)    4.50   4.50    .29
HB     SER6    HN     ASN7    3.64 (.10)    4.50   4.50    .86
HB     ASN7    HN     ASN8    4.73 (.01)    4.50   4.50    .23
HB     ASN8    HN     THR9    3.76 (.09)    4.50   4.50    .74
HB     THR9    HN     ARG10   4.21 (.36)    4.50   4.50    .29
HB     ARG10   HN     THR11   4.31 (.05)    4.50   4.50    .19
HB     THR11   HN     SER12   3.91 (.07)    4.50   4.50    .59
HB     SER12   HN     ISO13   4.30 (.10)    4.50   4.50    .20
HB     ISO13   HN     THR14   4.35 (.05)    4.50   4.50    .15
HB     THR14   HN     ISO15   3.55 (.05)    4.50   4.50    .95
HB     ISO15   HN     GLY16   4.33 (.02)    4.50   4.50    .17
HB     GLN19   HN     VAL20   4.17 (.07)    4.00   5.00    .00
HB     VAL20   HN     PHE21   4.00 (.07)    4.50   4.50    .50
HB     PHE21   HN     TYR22   4.20 (.12)    4.50   4.50    .30
HB     TYR22   HN     ARG23   4.25 (.14)    4.50   4.50    .25
HB     ARG23   HN     THR24   4.43 (.12)    4.50   4.50    .10
HB     THR24   HN     GLY25   4.57 (.02)    4.50   4.50    .07
HB     ASP26   HN     ISO27   3.76 (.08)    4.50   4.50    .74
HB     ISO27   HN     ISO28   4.55 (.06)    4.50   4.50    .06
HB     ISO28   HN     GLY29   3.84 (.05)    2.80   3.50    .34
HB     ASP30   HN     ISO31   2.85 (.15)    2.80   3.50    .04
HB     ISO31   HN     ARG32   3.88 (.06)    2.80   3.50    .38
HB     ARG32   HN     LYS33   4.10 (.34)    2.50   3.25    .85
HB     LYS33   HN     ALA34   2.21 (.06)    2.50   3.25    .29
HB     ALA34   HN     TYR35   4.06 (.29)    2.50   3.25    .81
HB     TYR35   HN     CYS36   3.60 (.11)    4.50   5.50    .90
HN     CYS2    HN     THR3    3.57 (.15)    4.50   4.50    .93
HN     THR3    HN     ARG4    2.94 (.26)    2.50   3.50    .02
HN     SER6    HN     ASN7    4.37 (.03)    4.50   4.50    .13
HN     ASN7    HN     ASN8    3.65 (.11)    4.50   4.50    .85
HN     ASN8    HN     THR9    4.35 (.01)    4.50   4.50    .15
HN     THR9    HN     ARG10   4.46 (.09)    4.50   4.50    .09
HN     ARG10   HN     THR11   4.27 (.05)    4.50   4.50    .23
HN     THR11   HN     SER12   4.41 (.01)    4.50   4.50    .09
HN     SER12   HN     ISO13   4.16 (.13)    3.00   3.50    .66
HN     ISO13   HN     THR14   3.93 (.14)    4.00   4.55    .08
HN     THR14   HN     ISO15   4.32 (.02)    4.50   4.50    .18
HN     ISO15   HN     GLY16   4.14 (.05)    4.50   4.50    .36
HN     GLY18   HN     GLN19   2.44 (.08)    2.70   3.25    .26
HN     GLN19   HN     VAL20   2.51 (.05)    2.70   3.20    .19
HN     VAL20   HN     PHE21   4.40 (.02)    4.50   4.50    .10
HN     PHE21   HN     TYR22   4.61 (.03)    4.50   4.50    .11
HN     TYR22   HN     ARG23   4.10 (.07)    4.50   4.50    .40
HN     ARG23   HN     THR24   4.60 (.06)    4.50   4.50    .10
HN     THR24   HN     GLY25   3.53 (.04)    4.50   4.50    .97
HN     GLY25   HN     ASP26   4.36 (.04)    3.00   4.50    .00
HN     ASP26   HN     ISO27   4.41 (.04)    3.00   4.50    .00
HN     ISO27   HN     ISO28   4.49 (.05)    4.50   4.50    .04
HN     ISO28   HN     GLY29   4.58 (.04)    3.00   3.75    .83
HN     GLY29   HN     ASP30   4.38 (.05)    4.50   4.50    .12
HN     ASP30   HN     ISO31   4.31 (.04)    4.50   4.50    .19
HN     ISO31   HN     ARG32   4.38 (.01)    4.50   4.50    .12
HN     ARG32   HN     LYS33   2.78 (.36)    2.50   3.25    .05
HN     LYS33   HN     ALA34   4.37 (.08)    2.50   3.25   1.12
HN     ALA34   HN     TYR35   1.90 (.08)    2.50   3.25    .60
HN     TYR35   HN     CYS36   4.28 (.07)    4.50   5.50    .22
HA     GLY16   HD     PRO17   2.18 (.04)    2.20   2.70    .03
HN     GLY18   HG1    VAL20   5.31 (.37)    5.00   6.00    .03
HA     GLY25   HB     ASP26   4.87 (.22)    5.00   6.00    .18
HB     ASP30   HA     ARG32   5.91 (.19)    4.50   5.50    .41
HG2    ISO27   HA     GLY29   4.83 (.29)    3.00   4.00    .83
HB     THR3    HN     ARG4    3.06 (.20)    3.00   4.00    .06
HN     THR24   HA     ISO27   8.32 (1.75)   4.50   5.50   2.82
HA     GLY25   HN     ISO27   6.71 (.11)    4.50   5.50   1.21

the (φ, ψ)-values for different residues. The average is taken over the 37 models that
are finally selected as structural solutions. The residues close to the C-terminus are
the most flexible in terms of their (φ, ψ)-values (Figure 9) and, consequently, in terms of
the spatial locations of their Cα-atoms (Figure 8B). Although we located two turns
(one close to the N-terminus and the other close to the C-terminus), the presence of
two Gs and the intrinsically flexible/dynamic (R32-K33-A34-Y35) segment (as
discussed above) makes the C-terminal half of the V3 loop the intrinsically more flexible
part of the molecule.

Figure 8: Description of the Thailand TN243 V3 loop structure.

(A) A ribbon diagram of the average folding pattern in agreement with the 2D NMR data presented in
Figures 2-7 and in Table II of the supplementary material. The secondary structural elements are color
coded: β-strand/extended = green, turn and coil = blue. The sulfur atoms of C2 & C36 involved in the
(S-S) bridge are shown in magenta. The residue C2 is on the left and C36 is on the right. The
immunogenic tip containing GPG at the center forms a protruding surface. The first turn close to the
N-terminus contains the residue N7* (the site of N-linked glycosylation).

(B) Superposition of the 22 lowest-energy structures in agreement with the 2D NMR data, in two different
orientations. The rms deviations of these structures are 0.40 ± 0.10 Å with respect to the 100 distances
obtained as experimental structural constraints. Note that only Cα-atoms are shown for clarity. The
residues close to the C-terminus show greater flexibility. The van der Waals surfaces of N7* and Q19 are
shown. Note that both functionally determinant sites are exposed to the environment. Out of the 37
finally selected structures, only the 22 models that do not show close overlap are shown here. In different
structures the relative positions of Q19 with respect to C2 and C36 differ, but the local structure of
G16-P17-G18-Q19 and the associated solvent accessibility of Q19 do not.

Figure 9: The average values and the corresponding standard deviations of the (φ, ψ)-values of different
residues in the Chiang Mai V3 loop. The 37 structures shown in Figure 8B are considered. The residues close to
the C-terminus show the largest standard deviations. The Cartesian coordinates of all 100 V3 loop models
can be obtained from the authors on request.

Table III, containing the average values and the standard
deviations of all torsion angles [i.e., (φ, ψ, χ¹, etc.)], is also included in the
supplementary material.
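
Because torsion angles are periodic, averaging values such as 179° and -179° arithmetically gives a misleading 0°. The text does not say how the averages and standard deviations were computed, so the following Python sketch is only one reasonable way to do it, using circular statistics; the function name and the choice of the circular standard deviation are assumptions.

```python
import math


def circular_mean_std(angles_deg):
    """Circular mean and circular standard deviation, both in degrees.

    ASSUMPTION: the source does not describe its averaging procedure;
    this is the usual mean-resultant-vector definition.
    """
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    mean = math.degrees(math.atan2(s, c))
    # Resultant length, clamped to guard against rounding slightly above 1.
    r = min(1.0, math.hypot(s, c))
    if r == 0.0:
        return mean, float("inf")  # angles spread uniformly: no defined mean
    std = math.degrees(math.sqrt(-2.0 * math.log(r)))
    return mean, std


# Two omega-like angles on either side of the +/-180 degree boundary.
mean, std = circular_mean_std([179.0, -179.0])
```

Here the mean comes out at the ±180° boundary with a spread of about 1°, rather than at 0° as a plain arithmetic mean would suggest.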

The molecular diagrams in Figures 8A & 8B and the (φ, ψ)-values in Figure 9 show that
most of the individual amino acids reside in the extended conformation. This
conclusion is based upon the observation of typical sequential Hα(i)-HN(i+1) NOESY/
ROESY connectivities and the J(Hα-HN) coupling values shown in Figures 3
and 7. The Hα(i)-HN(i+1) NOESY/ROESY connectivities alone cannot justify an
extended conformation. For example, such connectivities are often also observed in
a "random coil" in which each residue samples several (φ, ψ)-values. For some of the
(φ, ψ)-values, the Hα(i)-HN(i+1) distance can be short enough to show an NOE/ROE.
However, for a random coil, an average value of the J(Hα-HN) coupling should be
observed, i.e., a value of about 6-7 Hz. The φ-value corresponding to an average
value of the J(Hα-HN) coupling may never be adopted by a residue. This is especially
true when a residue samples around two distinctly separated φ-values (perhaps
separated by a barrier), i.e., (1) the alpha-helical region (φ = -50° ± 30°), giving rise to
an average J(Hα-HN) coupling of 5 Hz, and (2) the region of extended conformation
(φ = -150° ± 30°), giving rise to an average J(Hα-HN) coupling of 9 Hz. We have
determined J(Hα-HN) couplings for various residues from the DQF-COSY data (Figures
3 and 7). We ascribe the extended conformation only to those residues that simultaneously
show strong or medium Hα(i)-HN(i+1) NOESY/ROESY connectivities and J(Hα-HN)
couplings greater than or equal to 10 Hz.
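
The averaging argument above can be made concrete with a Karplus-type relation between φ and the J(Hα-HN) coupling. The text does not state which relation was used; the sketch below assumes the widely quoted parametrization of Pardi, Billeter, and Wüthrich (A = 6.4, B = -1.4, C = 1.9 Hz, with θ = φ - 60°) and the IUPAC sign convention for φ, and the function names are hypothetical.

```python
import math

# ASSUMED Karplus coefficients (Pardi, Billeter & Wuthrich parametrization);
# they are not given in the text.
A, B, C = 6.4, -1.4, 1.9


def j_hn_ha(phi_deg):
    """Predicted 3J(HN-Halpha) coupling in Hz for a backbone angle phi (degrees)."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


def two_state_average(phi_a, phi_b, frac_a=0.5):
    """Population-weighted coupling for a residue hopping between two phi values."""
    return frac_a * j_hn_ha(phi_a) + (1.0 - frac_a) * j_hn_ha(phi_b)


j_helical = j_hn_ha(-60.0)     # about 4.2 Hz
j_extended = j_hn_ha(-120.0)   # about 9.7 Hz
j_mixed = two_state_average(-60.0, -120.0)  # about 7 Hz
```

With these assumed coefficients, a residue that spends half its time near φ = -60° and half near φ = -120° shows a coupling of about 7 Hz, a value that corresponds to neither populated conformation. This is why the coupling is used together with the NOE/ROE pattern rather than alone.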

Discussion 

As explained in the Introduction, accurate determinations of (1) the global tertiary
fold and (2) the local structure of the PND are needed to explain the antigenic
specificity of the V3 loop. In addition, because the hypervariability of the amino
acid sequence flanking the conserved GPG sequence results in variability in the
antigenic specificity of the V3 loop, a systematic study is needed
to quantify the effect of amino acid sequence variability on the structure/
antigenicity of the V3 loop. We show in this article that the residues flanking the
GPG sequence of the Thailand TN243 V3 loop adopt an extended conformation. The
protruding PND surface, which contains 5-7 amino acids contiguous in space and
in sequence, can be recognized by an antibody via a "direct reading mechanism."
However, if the amino acids in the regions flanking the GPG sequence are folded
into an α- or 3₁₀-helix, the protruding PND surface can still be formed by 5-7 amino
acids contiguous in space but not necessarily in sequence. In such an "indirect
reading mechanism," residues in the interior of the antigenic tip also play an important
role in deciding the size and shape of the cavity (and hence also the size and shape of
the surface). We have theoretically identified a more folded conformation at the GPG crest of the
HIV-IIIB V3 loop in the presence of a monoclonal antibody, and have
subsequently verified the presence of such a conformation by site-specific mutations
[22,23]. The goal of such studies is to classify V3 loop sequences in terms of their (1)
global tertiary fold and (2) local structure of the PND, rather than by their amino
acid sequence. Determination and classification of the structural/antigenic properties
of the V3 loops will be useful in vaccine development.

Table III

The average values of all torsion angles and the corresponding standard deviations. 37 structures are
used. Note that the residues close to the C-terminal cystine (C36) show the largest deviations in their
torsion angles. The NOESY/ROESY data contain information about (φ, ψ, χ¹).


0) 

x' 

X^ 

x^ 

X^ 

x’ 

C2 

81 

-64 

-161 

178 

5 

7 

7 

10 

T3 

-75 

-53 

150 

50 

165 

-43 

12 

4 

10 

18 

17 

28 

R4 

-86 

106 

-159 

-168 

177 

176 

-173 

1 

179 

0 

5 

10 

9 

21 

3 

18 

26 

46 

2 

3 

P5 

-75 

84 

-176 

0 

8 

3 

S6 

-151 

177 

175 

-84 

-54 

8 

3 

6 

24 

55 

N7 

-85 

64 

176 

-171 

-104 

0 

2 

3 

3 

1 

1 

0 

N8 

-160 

169 

-179 

-90 

104 

0 

2 

5 

2 

5 

1 

1 

T9 

-99 

122 

174 

-59 

71 

176 

3 

14 

2 

2 

2 

4 

R  10 

-155 

144 

-178 

-109 

-77 

-176 

-82 

3 

-177 

0 

9 

4 

4 

6 

4 

5 

3 

4 

2 

2 

T  11 

-141 

175 

175 

-171 

83 

-173 

3 

4 

2 

1 

13 

1 

S  12 

-170 

137 

-165 

-83 

56 

2 

5 

9 

3 

6 

I  13 

-158 

118 

-177 

-176 

168 

-44 

67 

5 

5 

2 

2 

2 

3 

1 

T  14 

-177 

-166 

-175 

178 

97 

55 

3 

2 

3 

24 

18 

8 

I  15 

-145 

120 

-177 

-174 

168 

-45 

68 

2  2  111  2  1 
G  16  145  108  178 

2  2  2 

P  17  -75  69  179 

0  3  2 

G  18  109  -16  177 

2  2  1 

E  19  -75  -13  -172  -160  176  98  -179 

2  3  4  3  2  0  1 

V20  -144  161  -178  78  63  40 

3  3  2  8  4  5 

F21  -81  137  -178  -178  81 

3  6  4  2  3 

Y22  -168  136  172  173  79  -179 

7  4  3  11  13  3 

R23  -29  124  -167  -111  -73  -63  156  -168  3  4 

443  14  7  4  9  5  29  1 

T24  -166  96  -178  174  81  -36 

5  3  6  4  21  1 

G25  154  -172  172 

25  22  5 

D26  -143  150  179  -56  -82 

12  24  3  17  4 
Table III continued

         φ     ψ     ω    χ1    χ2    χ3

I27   -107   144   153  -131   179   -51   -45
        14     9     2    19    18    18    19

I28    -88   146  -175   -46   177  -173   -42
         7     3     2     4     4    17     1

G29    145  -160   178
         3    12     3

D30   -151   151   177    54    99
         3     3     3     2     4

I31   -149   168  -178  -159   153  -169    62
         2     3     4    15    21    20     8

R32   -117    51   169   -44   -70   172  -171   174   179   178
         7    20     6     5     3    11     4     3     1     1

K33   -104  -179  -156    75   157  -170
        56    13     7     5     5    14     8    30     7

A34   -163    13  -151   -55
        11    10     8     2

Y35   -176  -160   168  -147  -123     0
         9    11     6     4     4    41

C36   -127    89   179  -173
         9    17     1    16
Abstract

Human mucins are T or S glycosylated tandem repeat proteins. In breast cancer, mucins
become under- or unglycosylated. Two-dimensional nuclear magnetic resonance experiments are
performed on chemically synthesized mucin tandem repeat polypeptides,
(PDTRPAPGSTAPPAHGVTSA)n, in the unglycosylated form for n = 1, 3, where (APDTR) constitutes
the antigenic sites for the antibodies isolated from the tumors in the breast cancer patients.
These studies demonstrate how the tandem repeats assemble in space giving rise to the overall
tertiary structure, and the local structure and presentation of the antigenic site (APDTR) at
the junction of two neighboring repeats. The NMR data reveal repeating knob-like structures
connected by extended spacers. The knobs protrude away from the long axis of Muc-1 and
the predominant antigenic site (APDTR) forms the accessible tip of the knob. Multiple tandem
repeats enhance the rigidity and presentation of the knob-like structures.

Introduction 

Human mucins are a family of high molecular weight, heavily glycosylated proteins
which are dominated by large tandem repeat (TR) domains (1-4). Mucin tandem
repeat domains vary in size, proline content, and potential extent of glycosylation.
Underglycosylation of the human mucin Muc-1 tandem repeat domain in certain
breast, pancreatic and ovarian tumors results in the unmasking of protein core
epitopes (5-8). Tumor reactive, mucin-specific monoclonal antibodies reveal differences
between the surface of Muc-1 derived from tumors and normal tissues
(6,9,10). Synthetic peptide studies show that most tumor specific antibodies recognize an
epitope within the tandem repeat protein core of Muc-1 (11,12).

* Author to whom correspondence should be addressed.

^ This work was supported by the LANL grant XL 77.

Abbreviations: three Muc-1 tandem repeats (3TR), one Muc-1 tandem repeat (1TR),
3-trimethylsilylpropionate (TSP), Monte Carlo (MC), parts per million (ppm), Rapid Multiple
Peptide Synthesizer (RaMPS), trifluoroacetic acid (TFA), 9-fluorenylmethyloxycarbonyl (Fmoc),
total correlation spectroscopy (TOCSY), rotating frame Overhauser effect spectroscopy (ROESY),
nuclear Overhauser effect spectroscopy (NOESY), double quantum filtered correlated
spectroscopy (DQF-COSY), end-to-end length of protein (Re), major histocompatibility complex
(MHC), β-D-N-acetylgalactosamine (GalNac). Single letter amino acid codes are as follows:
A = alanine, D = aspartic acid, G = glycine, H = histidine, I = isoleucine, P = proline,
R = arginine, S = serine, T = threonine, V = valine.

Humoral immune responses and the epitope specificity of antibodies to the tandem
repeats of Muc-1 raise interesting structural questions (the sequence in each repeat
is PDTRPAPGSTAPPAHGVTSA). For example, every monoclonal antibody that
has been shown to be specific for the protein core of the TR domain of Muc-1, and in
which the fine specificity has been mapped, recognizes some or all of the sequence
A0-P1-D2-T3-R4-P5-A6 (A0 is the last residue from the previous repeat). This
includes 19/19 different monoclonal antibodies to the protein core reviewed by
Devine and McKenzie (13) and Taylor-Papadimitriou (10). In addition, mucin
specific IgM humoral immunity which has been detected in sera from patients with
breast and ovarian cancer has been determined to be specific for the A0-P1-D2-T3-R4-P5
sequence as well (14,15). The induction of exclusively IgM antibodies in cancer patients
suggests a mechanism of cross-linking of Ig receptors by multiple epitopes including
the sequence A0-P1-D2-T3-R4-P5. The questions that immediately arise are:
Why is the antibody specificity always centered on the sequence A0-P1-D2-T3-R4-P5?
Is there a structural explanation for the immunodominance of this epitope?

Tumor reactive Muc-1 specific cytotoxic T-cell lines have been described from
breast and pancreatic cancer patients in which it was indicated that the antigen
specificity was the intact protein core of Muc-1 expressed by these tumors, and not
the processed and presented forms (7,16). These α/β T-cells showed a lack of MHC
restriction, that is, they were able to kill both breast and pancreatic tumor cell lines
from individuals with different MHC alleles (16). Additional studies demonstrated
that tumor reactive non-MHC restricted CTL lines and clones could be blocked by
monoclonal antibodies to the same immunodominant epitope (A0-P1-D2-T3-R4-P5) of
the Muc-1 tandem repeat, and not by an antibody to HLA class I (W632) (17,18). The
multivalent nature of the TR domain suggests a possible mechanism for cross-linking
of T-cell receptors in the process of MHC unrestricted T cell activation (19).

We have undertaken the current study to gain insight into the structural basis of the
observed antibody immunodominance and MHC-unrestricted immunogenicity
of Muc-1 tandem repeats. The primary structural/functional question raised by
Muc-1 concerns the relationship between the number of tandem repeats and the formation
of the immunodominant antigenic determinant site, A0-P1-D2-T3-R4-P5.
Two-dimensional nuclear magnetic resonance (2D NMR) spectroscopy on chemically
synthesized mucin tandem repeat polypeptides, (PDTRPAPGSTAPPAHGVTSA)
and (PDTRPAPGSTAPPAHGVTSA)3, provides information about the global tertiary
folding (and associated packaging) of the TR domain and the local structure of
the antigenic determinant. These structural studies on Muc-1 TR with various
repeat numbers, n, reveal that a larger number of repeats (n > 2) enhances the definition of
the immunodominant site. In addition, we learn how these sites are spatially located
and oriented with respect to each other.

Materials  and  Methods 

Peptide  Synthesis 

Tandem repeat peptides 1TR and 3TR, corresponding to 1 tandem repeat (20 residues)
and 3 tandem repeats (60 residues), are synthesized by a manual solid-phase strategy
using 9-fluorenylmethyloxycarbonyl protected amino acids. The final products are
peptide amides. The procedures for synthesis, purification, and characterization of
the peptide products are described in detail elsewhere (20). Briefly, 20 and 60 amino
acid peptides are synthesized using the Rapid Multiple Peptide Synthesizer (RaMPS)
apparatus from Dupont (Boston, MA). For 3TR, once the peptide chain reached 30
residues, the resin was split in half and separated into two reaction cartridges to
allow space for the peptide chains to expand in the cartridge. Once the resins were
divided, the concentration of input amino acid was maintained at 0.5 mM in order to
drive the coupling reaction to completion with high efficiency. The products of the
synthesis were deprotected and cleaved from the resin support in concentrated
trifluoroacetic acid (TFA) in the presence of the appropriate scavengers. The TFA
soluble products were extracted sequentially in organic solvents and then transferred to
water and lyophilized. The peptides were purified by conventional gel filtration and
reverse-phase high pressure liquid chromatography (HPLC). Molecular weight
characterization of the peptide products was performed with an electrospray mass
spectrometer, which verified that the desired products were obtained with high yield.
The product with the correct molecular weight was purified by HPLC prior to NMR
analysis.

NMR  Experiments  and  Sequential  Assignments 

All the spectra were recorded on a Bruker 500 MHz AMX instrument at 10 °C with 5
mM peptide concentration in 0.01 M phosphate buffer (pH 5.5). All 2D data were
acquired in the phase-sensitive mode with saturation of the HDO signal during
the relaxation delay. DQF-COSY data were collected with the following acquisition
parameters: data matrix (t2 = 2048, t1 = 1024); relaxation delay = 1.5 s; number of
transients = 32. TOCSY data were collected with the data matrix (t2 = 2048, t1 = 1024);
relaxation delay = 1.5 s; number of transients = 32; isotropic mixing = 60 ms;
MLEV-64 pulse sequence for spin lock. NOESY data were collected with similar
acquisition parameters and for 500 and 200 ms of mixing. ROESY data were collected
in the phase-sensitive mode using the CW-spinlock pulse sequence with the
following acquisition parameters: data matrix (t2 = 2048, t1 = 512); mixing time = 150
ms; relaxation delay = 1.5 s; number of transients = 32. The sequential assignment
of 3TR is obtained by combining TOCSY, DQF-COSY, and NOESY data.

Extraction  of  Inter-proton  Distances  as  Structural  Constraints  from  the  NMR  Data 

This step involves obtaining an energetically stable structure given the secondary
structural states of the constituent amino acid residues as obtained by analyzing the
sequential NOESY and ROESY pattern. Appropriate ranges of (φ, ψ) values are
assigned to all amino acids. For example,

φ = -55° ± 25°,       ψ = -55° ± 25°        for residues in a helix;

φ = -140° ± 50°,      ψ = 140° ± 50°        for residues in a beta strand or
                                            in an extended conformation;

φ(i+1) = -65° ± 20°,  ψ(i+1) = -50° ± 20°,
φ(i+2) = -90° ± 20°,  ψ(i+2) = 0° ± 20°     for residues in a type-I turn;

φ(i+1) = -65° ± 20°,  ψ(i+1) = 120° ± 20°,
φ(i+2) = 90° ± 20°,   ψ(i+2) = 0° ± 20°     for residues in a type-II turn.

(φ, ψ) of residues in the coil state are set free to choose any values in the allowed
space (for definitions of different secondary structures and corresponding (φ, ψ)
values, see Ramachandran & Sasisekharan, 1968 (21)).
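The windows listed above can be written down as a small lookup, which makes it easy to check which secondary structural states a given (φ, ψ) pair is compatible with. The following is a minimal sketch only; the dictionary layout and the names `WINDOWS`, `in_window`, and `matching_states` are assumptions made for illustration and are not part of the refinement program used in the study.

```python
# Illustrative (phi, psi) windows in degrees, copied from the ranges listed above.
# Each entry is ((phi_center, phi_halfwidth), (psi_center, psi_halfwidth)).
# The data layout and all names here are hypothetical.
WINDOWS = {
    "helix": ((-55.0, 25.0), (-55.0, 25.0)),
    "extended": ((-140.0, 50.0), (140.0, 50.0)),
    "turn_I_i+1": ((-65.0, 20.0), (-50.0, 20.0)),
    "turn_I_i+2": ((-90.0, 20.0), (0.0, 20.0)),
    "turn_II_i+1": ((-65.0, 20.0), (120.0, 20.0)),
    "turn_II_i+2": ((90.0, 20.0), (0.0, 20.0)),
}


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def in_window(phi, psi, state):
    """True if (phi, psi) lies inside the window assigned to `state`."""
    (phi0, dphi), (psi0, dpsi) = WINDOWS[state]
    return (angular_difference(phi, phi0) <= dphi
            and angular_difference(psi, psi0) <= dpsi)


def matching_states(phi, psi):
    """All listed states whose window contains (phi, psi)."""
    return [state for state in WINDOWS if in_window(phi, psi, state)]
```

Because the windows overlap, a single (φ, ψ) pair can match more than one state; in the procedure described here the state is assigned from the NOESY/ROESY pattern, and the window only bounds the search.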


We obtain starting mucin structures by a linked-atom least-squares refinement,
minimizing a function, F, only in the (φ, ψ) space:

$$F = \text{R-Factor} + \sum \left(d_{ij}^{mn} - D^{mn}\right)^2, \qquad \text{R-Factor} = \frac{\sum \lvert I_o - I_c \rvert}{\sum I_o} \qquad [1]$$

Here $I_o$ is the observed NOESY intensity and $I_c$ is the NOESY intensity calculated by
full-matrix NOESY simulations; $d_{ij}^{mn}$ is the actual distance between atom i (type m) and
atom j (type n), and $D^{mn}$ is the corresponding contact limit between atom types m and
n. The R-Factor sums extend over all pairs (i, j) of spins for which NOESY cross-peaks are
observed. Full-matrix NOESY simulations with respect to experimental data at two
mixing times (one low and another high) enable us to include both primary and
higher orders of NOEs. Thus, the complications in the distance estimate using a
two-spin model, often encountered at a high mixing time due to spin diffusion (i.e.,
higher order NOEs), are avoided in the full-matrix NOESY simulations, where all
spins are considered in the relaxation (22). Such a simulation at two mixing times improves
the rigor of distance estimates.
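To make the target function of equation [1] concrete, the sketch below evaluates the R-Factor and the contact term for plain lists of numbers. It is an illustration under stated assumptions: the function names are hypothetical, the intensities are taken as already simulated, and the contact term is applied literally to every supplied pair because the text does not say whether it is restricted to pairs that violate the contact limit.

```python
def r_factor(observed, calculated):
    """R-Factor = sum|Io - Ic| / sum(Io) over the observed cross-peaks."""
    numerator = sum(abs(io - ic) for io, ic in zip(observed, calculated))
    return numerator / sum(observed)


def contact_term(distances, contact_limits):
    """Sum of squared deviations of pair distances from their contact limits.

    Assumption: applied to every pair passed in, exactly as the term is written.
    """
    return sum((d - limit) ** 2 for d, limit in zip(distances, contact_limits))


def target_function(observed, calculated, distances, contact_limits):
    """F = R-Factor + contact term, following equation [1]."""
    return r_factor(observed, calculated) + contact_term(distances, contact_limits)
```

In an actual refinement the calculated intensities would come from a full-matrix relaxation calculation at each mixing time, which is well beyond this sketch.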


In this refinement, first the (φ, ψ) values of various residues are only elastically
varied (i.e., variables with weights) such that by appropriate choice of weights the
experimentally determined secondary structural states of residues are minimally
altered (23). Finally, the potential energy of the 1TR/3TR sequence is minimized in
the (φ, ψ, ω, χ) space using the force-field of Scheraga and co-workers (24). The
minimization of the function, F, in the (φ, ψ) space followed by the energy
minimization in the (φ, ψ, ω, χ) space is repeated for 30 different starting structures
(25). The set of 30 different structures is chosen such that they are conformationally
different; considering the positions of the Cα atoms, the rms deviations
among the chosen 30 structures are greater than 2.5 Å. As discussed later, such a
large deviation is possible because of the flexibility in the mucin spacer. Although
the residues in the spacer remain in the extended conformation, the correlated
motions in the backbone torsion angles of these residues can cause a large change
without altering the sequential NOEs. The choice of 30 different starting structures
duly accounts for the whole range of spacer flexibility. At the end of this procedure,
we obtain a set of 30 models in agreement with the NOESY and ROESY data. From
these models, a set of inter-proton distances is extracted as structural constraints
required for agreement with the NOESY/ROESY data. Each pairwise distance representing
a structural constraint provides an upper and a lower limit of the distance.
Two types of constraints are identified.

Type I

This is given as

$$\mathrm{EDIST} = \begin{cases} 0 & \text{if } r_1 \le r \le r_2 \\ k\,(r - r_1)^2 & \text{if } r < r_1 \\ k\,(r - r_2)^2 & \text{if } r > r_2 \end{cases}$$

where r is the distance, $r_1$ and $r_2$ bound the specified range, and k is the force constant.

Type II

This is given as

$$\mathrm{EDIST} = \begin{cases} 0 & \text{if } r > r_1 \\ k\,(r - r_1)^2 & \text{if } r < r_1 \end{cases}$$

This type is particularly useful for an unobserved NOE where we can set a lowest
allowable distance limit for the corresponding proton pair. Analyses of the 2D
NMR data of 3TR produced 220 inter-proton distance constraints.
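The two constraint types are simple flat-bottomed penalties, and a short sketch may help when reading the simulated annealing section that follows. The function names are hypothetical, and the quadratic form assumes the exponent in the expressions above is 2; units for r and k are left to the caller.

```python
def edist_type1(r, r1, r2, k):
    """Type I: zero inside [r1, r2], harmonic penalty outside either bound."""
    if r < r1:
        return k * (r - r1) ** 2
    if r > r2:
        return k * (r - r2) ** 2
    return 0.0


def edist_type2(r, r1, k):
    """Type II: lower bound only, e.g. for an unobserved NOE."""
    if r < r1:
        return k * (r - r1) ** 2
    return 0.0


def total_edist(type1_terms, type2_terms):
    """Sum the penalties over lists of (r, r1, r2, k) and (r, r1, k) tuples."""
    total = sum(edist_type1(*term) for term in type1_terms)
    total += sum(edist_type2(*term) for term in type2_terms)
    return total
```

A Type II term never penalizes a pair for being far apart, which is why it suits proton pairs with no observed cross-peak.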

Examination  of  the  Conformational  Flexibility  Subject  to  the  NMR  Data 

The energy term, EDIST, is added to the force-field as in Scheraga and co-workers
(24). The simulated annealing is performed (26) in the following manner. First, a
starting energy-minimized structure is chosen and Monte Carlo (MC) simulations
are performed for 50,000 steps at 1000 K in the (φ, ψ, ω, χ) space; the last accepted
conformation is stored to be subsequently used as a starting conformation in the
next lower temperature cycle. Second, 50,000 MC steps are repeated in several cycles
of gradually decreasing temperature until a temperature of 100 K is reached. Third,
the lowest energy configuration at 100 K is further energy-minimized to a low energy
gradient. This is the "temperature quenching" step in which thermally excited single
bond rotations around the equilibrium positions are quenched. Finally, the first
three steps are repeated for 120 different starting conformations, which comprise
the 30 starting structures obtained after the step discussed in the previous paragraph
and 90 other structures obtained after carrying out the first through third steps of the
simulated annealing protocol. The maximum step size of the torsion angles was set
at 15 degrees, which produced acceptance ratios of 0.20-0.50 for the 50,000 step MC
cycle at each temperature. Full-matrix NOESY calculations were re-computed for
the final 120 Muc-1 3TR structures. Details of the simulated annealing procedure
and its application in efficient conformational search have already been described
elsewhere (22, 27).
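The first two steps of the protocol (fixed-temperature Monte Carlo cycles chained from high to low temperature) can be sketched as follows. This is a toy illustration, not the program used in the study: the energy function is a stand-in for the force field plus EDIST, the temperature schedule between 1000 K and 100 K is an assumption because the text only says "several cycles", and all function names are hypothetical. The final energy minimization (the quenching step) is omitted.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K); the energy unit is an assumption


def metropolis_cycle(angles, energy_fn, temperature, n_steps, max_step, rng):
    """Run one fixed-temperature Monte Carlo cycle in torsion-angle space."""
    current = list(angles)
    e_current = energy_fn(current)
    best, e_best = list(current), e_current
    accepted = 0
    for _ in range(n_steps):
        trial = list(current)
        i = rng.randrange(len(trial))                 # pick one torsion angle
        trial[i] += rng.uniform(-max_step, max_step)  # perturb it by at most max_step
        e_trial = energy_fn(trial)
        delta = e_trial - e_current
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0.0 or rng.random() < math.exp(-delta / (K_B * temperature)):
            current, e_current = trial, e_trial
            accepted += 1
            if e_current < e_best:
                best, e_best = list(current), e_current
    return current, best, e_best, accepted / n_steps


def anneal(start, energy_fn, temperatures, n_steps=50000, max_step=15.0, seed=0):
    """Chain cycles over decreasing temperatures.

    The last accepted conformation of each cycle seeds the next one; the
    lowest-energy conformation of the final cycle is returned for quenching.
    """
    rng = random.Random(seed)
    current = list(start)
    best, e_best, ratios = list(start), energy_fn(start), []
    for temperature in temperatures:
        current, best, e_best, ratio = metropolis_cycle(
            current, energy_fn, temperature, n_steps, max_step, rng)
        ratios.append(ratio)
    return best, e_best, ratios


def toy_energy(angles):
    """Hypothetical stand-in energy with a minimum at -60 degrees per angle."""
    return sum(1.0 - math.cos(math.radians(a + 60.0)) for a in angles)
```

A call such as `anneal([0.0] * 4, toy_energy, [1000.0, 550.0, 100.0], n_steps=500)` returns the selected conformation, its energy, and the acceptance ratio of each cycle, which is the quantity the text reports as 0.20-0.50 for the real system.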
Figure 1: ROESY cross-section of the finger-print HN-Hα region of 5 mM 1TR in 90% H2O + 10% D2O at
10 °C (10 mM phosphate buffer with pH 5.5).

Results 

NMR Data of a Single Muc-1 Tandem Repeat (1TR)

For 1TR, the total correlation spectroscopy (TOCSY) and rotating frame Overhauser
effect spectroscopy (ROESY) data are used to arrive at the sequential assignment of
the protons belonging to the constituent amino acids; then ROESY data are used to
estimate various inter-proton distances. 2D NMR studies on 1TR show a flexible
extended conformation of the molecule. The flexibility and the dynamics of the
molecule are such that the effective correlation times for most of the pair-wise
inter-proton interactions are short enough to elude the observation of the corresponding
nuclear Overhauser effect cross-peaks in the laboratory-frame
NOESY experiments. However, several inter-residue interactions are identified in
the ROESY experiment (28, 29). Figure 1 shows the HN-Hα or finger-print
region of the ROESY spectrum of 1TR. The prominent sequential Hα(i)-HN(i+1) NOEs
and the absence of HN(i)-HN(i+1) NOEs indicate an extended conformation. The
absence of most of the inter-proton interactions in the NOESY (30, 31) indicates that
the extended structure is flexible. In addition, it is also evident from the NMR data of 1TR
that the (A)PDTR-turn is not formed in the absence of A from the previous repeat.
Previous studies have shown that an unglycosylated 11 residue Muc-1 peptide can
contain residual β-turn structure (32, 33). Scanlon et al. found a type I turn through the
residues D2-T3-R4 using one complete tandem repeat p(1-20) (34) and in three nine
residue mucin or substituted mucin peptides (35). The difference in the type of turn and
the one residue shift could be attributed to a preferential stabilization of the type I β-turn
conformation in the organic solvent, dimethyl sulfoxide, used in that work. However,
these studies reveal a tendency toward the formation of β-turns within this region.

NMR  Data  of  Three  Muc-1  Tandem  Repeats  (3TR) 

Structural analyses of 3TR, (PDTRPAPGSTAPPAHGVTSA)3, involve a more
extensive data set that includes the TOCSY and NOESY data for sequential assignment,
and the double quantum filtered correlated spectroscopy (DQF-COSY) and
NOESY data at 500 and 200 ms of mixing time for structure derivation. Table I lists the
chemical shift values of the protons belonging to the residues in 3TR. The observation of
interproton interactions in the 500 and 200 ms mixing time NOESY experiments
indicates that 3TR is structurally more ordered than 1TR. Figures 2A and 2B show
the NOESY cross-sections (mixing time = 500 ms) for the HN vs. Hα and
HN vs. HN regions. Several features of an ordered structure are evident from Figure
2. For example, excepting a few residues at the N- and C-termini, the same residues
in different repeats experience the same chemical shift environment and also show
equivalent inter-residue NOE patterns, as would be expected for a tandemly repeating
structure.

Table I

Chemical Shift Values for the Central Tandem Repeat

Residue      NH     Hα          Hβ          Hγ     other
A20/40       8.43   4.63        1.39
P21/41              4.42        2.30        1.92   3.81/3.68 (Hδ)
D22/42       8.51   4.64        2.74/2.67
T23/43       8.14   4.33        4.27        1.20
R24/44       8.32   4.62        1.80        1.71   3.22 (Hδ)
P5/25/45            4.41        2.28        1.88   3.83/3.62 (Hδ)
A6/26/46     8.54   4.59        1.38
P7/27/47            4.43        2.32        1.96   3.85/3.68 (Hδ)
G8/28/48     8.60   3.99/3.95
S9/29/49     8.22   4.54        3.95/2.90
T10/30/50    8.28   4.39        4.25        1.21
A11/31/51    8.34   4.61        1.36
P12/32/52           4.71        2.36        1.90   3.86/3.64 (Hδ)
P13/33/53           4.39        2.28        1.86   3.83/3.63 (Hδ)
A14/34/54    8.47   4.25        1.35
H15/35/55    8.47   4.69        3.26/3.20          7.28 (4H)
G16/36/56    8.47   4.00/3.97
V17/37/57    8.20   4.24        2.13
T18/38       8.39   4.43        4.25        1.22
S19/39       8.37   4.47        3.85

Chemical shift values in ppm of the protons belonging to the 20 central amino acids in 3TR at 10 °C and at pH
5.5. The chemical shift values are given with respect to 3-trimethylsilylpropionate (TSP) as the internal
standard. The sequential assignment is obtained by combining TOCSY and NOESY data.

of the knobs relative to one another. (A) (left) The elongated structure with the knobs protruding from the same face of the molecule. Re =
82 Å in this structure. (B) (middle) A representative structure with intermediate Re (= 46 Å) with the knobs facing opposite directions. (C)
(right) A compactly folded form of Muc-1 with Re = sA has the knobs oriented toward two corners of an equilateral triangle.

Local  Structure  of  the  Immunodominant  Epitope 

The diagnostic NOE pattern of importance is the one involving the immunodominant
epitopes, i.e., A20-P21-D22-T23-R24 and A40-P41-D42-T43-R44. The sequential
Hα(i)-HN(i+1) and HN(i)-HN(i+1) NOEs are indicative of a type-II β-turn of
APDTR centered around P and D (36). Figure 2A shows the strong Hδ(P)-Hα(A)
and Hα(P)-HN(D) NOEs that are characteristic of a tight type-II β-turn; in addition,
Figure 2B shows that the inter-residue HN(D)-HN(T) NOEs are consistent with a
type-II β-turn. In contrast, a type I β-turn at this position should produce weak
sequential Hα(i)-HN(i+1) and Hα(i+1)-HN(i+2) NOEs (37). This pattern of NOEs
for the type II β-turn in the immunodominant epitopes A20-P21-D22-T23-R24 and
A40-P41-D42-T43-R44 of 3TR is the same as described for the principal neutralizing
determinant (PND) with sequence GPGRA that is located inside the third variable
(V3) loop of the human immunodeficiency virus type I (HIV-1). Incidentally,
HIV GPGR is favored to form a stable type II β-turn (23, 38). For example, the NOE
pattern observed for the PND fragment of HIV (GPGRA) is strong Hδ(P(i+1))-Hα(G(i)),
Hα(P(i+1))-HN(G(i+2)), medium/strong Hα(G(i+2))-HN(R(i+3)) and
medium/weak HN(G(i+2))-HN(R(i+3)) (22, 39). The same pattern of NOEs is
observed for the mucin immunodominant epitope involving residues APDTR, as
shown in Figures 2A and 2B. The NMR observations of a type II β-turn in the HIV
PND sequence (GPGR) were verified subsequently with crystallographic evidence
(40). Wilmot and Thornton, using a set of 59 protein crystal structures, found a
sequence preference for type II β-turns that included proline in the i+1 position
followed by either G or N in the i+2 position (38). When N is present at the i+2 position,
the (φ(i+2), ψ(i+2)) values shift to region 3 of the Ramachandran plot with
φ(i+2) = 60° ± 30° and ψ(i+2) = 60° ± 30° from the ideal values of (90°, 0°), which are
stereochemically only possible for G. Such a departure from the type II β-turn helps to accommodate
the heavier sidechain of N. This departure also weakens the C=O(i)...HN(i+3) H-bond.
But the energetic loss due to the weakening of the H-bond in the type II β-turn
with N at the (i+2) position is always compensated by the electrostatic interactions
of the N with neighboring residues (38). The type II β-turns (APDTR) in 3TR also
show experimental evidence for the formation of a salt bridge through the side
chains of D22/R24 and D42/R44. We could observe R side chain NH2 group signals
that were broad but low-field shifted to 10 ppm in the spectrum of 3TR in water as
measured by the "jump and return" method (41) (absent in the spectrum of 1TR).
This low-field shift of the R side chain NH2 proton signals demonstrates partial
burial of these groups from the solvent. Salt bridge formation with the COO- group
of D and the NH2+-C-NH2+ group of R is indeed possible when the immunodominant
epitope APDTR takes a tight turn as described above. Scanlon et al. also found
evidence for salt bridge formation between the side chains of D2 and R4 using a 20
residue peptide dissolved in dimethyl sulfoxide (34). Scanlon et al., however, propose a
type I turn at (P1-D2-T3-R4) with D2 and T3 as the central residues of the turn; the
presence of a strong Hα(D2)-HN(T3) NOE disfavors such a turn. However, a type I turn
at (P1-D2-T3-R4) could conceivably co-exist with the type II turn at A0-P1-D2-T3.
Regardless, the presence of a turn of any kind creates a protruding knob at A0-P1-D2-T3-R4.
The absence of this tight turn in 1TR clearly indicates that residues at the TR interface
contribute to the formation of a well defined APDTR turn. Finally, when all three APDTR
segments in 3TR were replaced by (GPGRA) of the HIV-1 PND, we observed that the latter
was structurally isomorphous with the same sequence inside the V3 loop and with the APDTR of
3TR. All these data are consistent with a type II β-turn of the APDTR sequence (42).

Figure 3 shows the finger-print region of the DQF-COSY data; the J(HN-Hα) coupling
constants in the figure reveal φ-values in the range of ±80 to 180. The J(HN-Hα) coupling
constants for the internal tandem repeat of Muc-1 3TR are as follows: D(9.3), T(8.9),
R(13.5), A(8.3), G(11.0), S(8.7), T(9.2), A(13.5), A(7.7), H(9.9), G(12.1), V(8.9), T(8.9), S(8.3),
A(9.8). The J(HN-Hα) coupling constants for the terminal residues are as follows: R4 (9.8),
T58 (8.4), S59 (11.0), A60 (8.1). The D2 cross peak was not observed in this cross section.
The J coupling constants are estimated using FELIX software; the J coupling values
are therefore approximate and only provide the upper bounds.
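The link between a J(HN-Hα) coupling constant and the backbone angle φ is usually expressed through a Karplus relation. The sketch below is an illustration only: the text does not state which parametrization was used, so the coefficients of Pardi et al. (1984), A = 6.4, B = -1.4, C = 1.9 Hz, are an assumption, as are the function names and the one-degree scan.

```python
import math

# Karplus coefficients in Hz. Assumption: Pardi et al. (1984) parametrization;
# the source text does not say which set was applied.
A, B, C = 6.4, -1.4, 1.9


def karplus_j(phi_deg):
    """J(HN-Halpha) in Hz predicted for a backbone angle phi (degrees)."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


def phi_candidates(j_obs, tol=0.5):
    """Integer phi values (degrees) whose predicted J lies within tol of j_obs."""
    return [phi for phi in range(-180, 181)
            if abs(karplus_j(phi) - j_obs) <= tol]
```

Under this parametrization the predicted coupling never exceeds 9.7 Hz (reached at φ = -120°), so `phi_candidates` returns an empty list for the largest listed values such as 13.5. That is in line with the statement above that the measured values are approximate upper bounds rather than exact couplings.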

Knob-Like  Structures  of  Muc-1  3TR 

As  explained  mMethods,  NOESY  data  at  200  and  500  ms  of  mixing  are  used  for  distance 
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Figure  4:  Conserved  structural  features  of  the  tandem  repeat  domain  of  Muc-1  include;  Repeating  and  prot¬ 
ruding  knob-like  structures  consisting  of  sequential  reverse  turns  that  span  tandem  repeat  interfaces  (residues 
17-27, 37-47),  an  extended  region  composed  of  polyproline  n  and  p-strand  structure  (residues  10-15, 30-35, 50- 
55).  The  N-  and  C-  terminal  2-3  residues  are  unordered  due  to  the  absence  of  adjoining  tandem  repeats. 

are  used  for  determining  the  J(HN-“H)  coupling  constants  required  for  estimating 
various  (p  angles.  A  total  of  220  distance  constraints  are  derived  using  a  full-matrix 
NOESY  simulation  (22);  in  this  method  all  orders  of  NOE  are  considered  as  data 
not  merely  the  primary  NOEs.  Therefore,  the  spin-diffusion  problem  (i.e.  the  presence  of 
higher  order  NOEs  is  not  a  problem  but  are  included  in  the  data  set  The  45  torsional  and 
the  220  distance  constraints  are  then  incorporated  in  a  Monte  Carlo  (MC)  simulated 
annealing  procedure  to  sample  different  3TR  conformations  that  agree  with  the  NMR 
data (26, 43). Simulated annealing constrained to the NMR data results in 120 energy-minimized
structures for 3TR. All of the structures share critical features, as shown in
Figure 4: (1) a solvent-exposed protruding knob with the immunodominant
APDTR turn at the tip; (2) the protruding knob spans residues V17-P27 and V37-P47,
thereby including the last 4 residues from the previous repeat; (3) the knobs are connected
by an extended spacer structure (comprising β-strand and polyproline conformations)
for residues 30-35; and (4) extended conformations are also detected for the
residues  10-15  and  50-55.  The  conformational  difference  among  the  sampled  structures 
is  the  difference  in  the  end-to-end  length  (R^)  and  the  relative  spatial  orientation  of  the 
two knobs. Most of the sampled structures show R^ = 70-90 Å (data
not shown). The large extent of variation in R^ (= 30-90 Å) prompted us to examine
the nature of variation in the backbone torsion angles of the sampled structures of 3TR
that finally produce the variation in R^. Table II lists the average values and the standard
deviations of the backbone and side chain torsion angles for the 20 central residues
in 3TR. It is seen that the standard deviations in the backbone torsion angles are below 10
degrees except for a few residues. Therefore, small changes in the backbone torsion
angles, correlated over several residues, give rise to large variations in R^. The sampled
structures all show similar agreement with the NMR data. Figure 5 shows three
such structures of 3TR for long, intermediate, and short R^. Conserved structural
features (1)-(4), described above, are retained in all three structures. However, the tertiary
chain-folding patterns differ among these structures. The structure of 3TR
with the long R^ (Figure 5A) has the protruding immunodominant knobs arranged
on a linear extended chain; the structure with the intermediate R^ has the knobs in
opposite orientations (Figure 5B); the structure with the short R^ has the knobs
pointing out from the two corners of a triangle (Figure 5C). These structures illustrate
an unusual characteristic of Muc-1: the ability to present multiple copies of an
antigenic determinant with flexibility in the intervening sequences.
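The step from J(HN-Hα) coupling constants to φ angles mentioned above is usually made through a Karplus-type relation. The sketch below assumes one commonly used parametrization (A = 6.4, B = -1.4, C = 1.9 Hz, with θ = φ - 60°); the text does not state which coefficients were used in this study, so the numbers are illustrative only.

```python
import math

# Assumed Karplus coefficients in Hz; not taken from this paper.
A, B, C = 6.4, -1.4, 1.9

def karplus_j(phi_deg):
    """Predicted 3J(HN-Halpha) in Hz for a backbone phi angle in degrees."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

def candidate_phi(j_observed, tolerance=0.5):
    """All integer phi values (degrees) whose predicted J lies within tolerance."""
    return [phi for phi in range(-180, 180)
            if abs(karplus_j(phi) - j_observed) <= tolerance]

j_extended = karplus_j(-120.0)  # extended (beta-like) backbone
j_helical = karplus_j(-60.0)    # helical or turn-like backbone
```

A single coupling constant is generally compatible with more than one φ value, which is one reason the torsional constraints are combined with the distance constraints rather than used alone.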


Table II

Torsion  Angles  and  Standard  Deviations  for  the  Internal  Tandem  Repeat 
Antigenic tip of Muc-1

Figure 6: Structure of the immunogenic knob comprising the central eleven residues (V17-T18-S19-
A20-P21-D22-T23-R24-P25-A26-P27). Note that residues 37-47 in the same molecule show an almost identical
structure. All the low-energy NMR-derived structures show similar structures for residues 17-27 or 37-47.

Figure 6 shows the structure of the immunogenic knob comprising the central
eleven residues (V17-T18-S19-A20-P21-D22-T23-R24-P25-A26-P27). Such a knob
configuration is common to all the low-energy structures of Muc-1 3TR. The local
structure of residues 18-26 is virtually indistinguishable from that of residues 38-46.
A close examination of the knob reveals a double bend at the tip. The two bends are centered
around P21-D22 and D22-T23. These two consecutive bends bring the sidechain of
R24 into favorable electrostatic contact with the backbone C=O and sidechain
COO- of D22. This also results in the burial of the T23 sidechain. Such a conformation
is consistent with the NH-NH NOEs between (D22&T23) and (T23&R24) and the
strong Hα-NH NOEs between (P21&D22). One might argue that such NOEs may
also be due to a mixture of a type II turn centered at P21-D22 and a type I turn centered at
D22-T23. If that is true, the conformational equilibrium must be fast on the NMR
time-scale because we do not observe resonance doubling for residues 21-24.
The immunogenic knob shown in Figure 6 retains a type II turn at P21-D22, and T23
shows the same conformation as in a type I turn (see Table II). Nevertheless, what is most
important in the structure is the surface exposure of the critical APDTR residues.
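The distinction between type I and type II turns drawn above rests on the backbone (φ, ψ) angles of the two central residues of the turn. The following minimal sketch classifies a turn against the textbook ideal angles; the 30° tolerance and the example angles are assumptions for illustration and are not values from Table II.

```python
# Ideal (phi, psi) in degrees for the two central residues (i+1, i+2) of a beta-turn.
IDEAL_TURNS = {
    "I": ((-60.0, -30.0), (-90.0, 0.0)),
    "II": ((-60.0, 120.0), (80.0, 0.0)),
}

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def classify_turn(phi1, psi1, phi2, psi2, tolerance=30.0):
    """Return 'I', 'II', or None for the central residues of a four-residue turn."""
    observed = (phi1, psi1, phi2, psi2)
    for name, (first, second) in IDEAL_TURNS.items():
        ideal = first + second
        if all(angle_diff(o, i) <= tolerance for o, i in zip(observed, ideal)):
            return name
    return None

# Hypothetical angles, chosen only to exercise the classifier.
turn_a = classify_turn(-65.0, -25.0, -95.0, 5.0)
turn_b = classify_turn(-60.0, 130.0, 85.0, -10.0)
extended = classify_turn(-120.0, 140.0, -120.0, 140.0)
```

For a turn centered at P21-D22, P21 and D22 would play the roles of residues i+1 and i+2 in this scheme.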
Discussion

The results of the present 2D NMR investigations of 1TR and 3TR reveal that no
significant structure exists in one copy of the mucin tandem repeat
(PDTRPAPGSTAPPAHGVTSA), and that the peptide with three tandem repeats exhibits flexible
structures with a well-defined knob-like structure centered around the antigenic
(APDTR) site. Previously, we compared peptides containing one, two, and three
tandem repeats and showed changes occurring throughout the one-dimensional
NMR spectrum (44). We concluded that the secondary structure develops as additional
tandem repeats are added. The NMR investigations were further supported by
results with circular dichroism, in which the ratio of the molar ellipticity at 198 nm of
one, three or five tandem repeats to that of a control peptide was 3.2 for one tandem
repeat, 13.6 for three tandem repeats and 21.0 for 5.25 tandem repeats (20). This nonlinear
increase in molar ellipticity can be attributed to the development of structure
as the number of tandem repeats increases. Furthermore, we presented a model for the
structure of Muc-1 tandem repeats that was suggested by Matsushima et al. for
proline-rich tandem repeats (44, 45). We predicted an elongated structure composed
of repeating type I turns. The actual structure for the Muc-1 tandem repeat domain
is shown in Figure 5, and differs significantly from the earlier model in that it contains
repeating solvent-exposed protruding knobs spanning the tandem repeat
interface. The knob structures (V17-P27 and V37-P47) are crested by the immunodominant
APDTR turn at the tips, and sequential knobs are connected by extended
spacer structures for residues T30-A35, T10-A15, and T50-A55. The physical
dominance of the knob structure at the tandem repeat interface was surprising and
yet serves to explain much of the observed immunoreactivity (10, 13, 16).
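The nonlinearity of the circular dichroism ratios quoted above can be checked with simple arithmetic. The sketch below uses only the three ratios given in the text; taking the one-repeat value as the baseline for a strictly linear expectation is an assumption made here for illustration.

```python
# Ratio of molar ellipticity at 198 nm to that of the control peptide,
# keyed by the number of tandem repeats (values quoted in the text).
cd_ratio = {1.0: 3.2, 3.0: 13.6, 5.25: 21.0}

def per_repeat(ratios):
    """Ellipticity ratio divided by the number of repeats."""
    return {n: r / n for n, r in ratios.items()}

def excess_over_linear(ratios, baseline_repeats=1.0):
    """Observed ratio minus the value expected if each repeat contributed
    the same amount as in the baseline peptide."""
    unit = ratios[baseline_repeats] / baseline_repeats
    return {n: r - unit * n for n, r in ratios.items()}

normalized = per_repeat(cd_ratio)
excess = excess_over_linear(cd_ratio)
```

The per-repeat ratio rises from 3.2 for one repeat to about 4.5 for three repeats, so the longer peptides give more signal per repeat than the single repeat does.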

The stable turn structure and protruding nature of the repeating epitope APDTR
could obscure access to the remainder of the molecule by B and T cell antigen receptors
and help explain the observed antibody immunodominance of APDTR and the
MHC-unrestricted activation of T cells by unglycosylated mucin on tumors (10, 16).
A parallel MHC-restricted T cell response could be generated to peptides from the
processed Muc-1 tandem repeats (17). A restricted T cell response would depend on
the presence of proteolytic processing sites within Muc-1 and the ability of processed
peptides to bind to class I MHC proteins, and finding restricted T cells would
not be incompatible with the interpretation of these data (46).

The enzymatic O-glycosylation pattern of the T residues of 3TR is particularly relevant
to the knob-like structures shown in Figures 4 and 5. 2D NMR studies recently
carried out in our laboratory in collaboration with Dr. Henrick Clausen show that
enzymatic glycosylation of 3TR produces T residues O-glycosylated with
N-acetylgalactosamine (GalNAc) in TS and ST sequence doublets and not at the T of
the APDTR turn. The tight turn at APDTR, where the OH group of T is buried inside
the turn, could explain the protection of this T from glycosylation. An alternative
explanation for the lack of glycosylation at the T within PDTRP is that the
primary sequence may not signal for glycosylation at this position (47). In general,
the elongated structures shown in Figure 5A can easily accommodate glycosylation
at the potential T and S sites without steric problems. This implies that a mucin
molecule containing at least thirty-six 20-residue tandem repeats (when glycosylated)
could span more than 1,000 Å from the cell surface and thereby function as an
anti-adhesion molecule (48). This is the first study that shows that the protein core of the TR
domain intrinsically contains an elongated structure studded with immunodominant
knobs that protect the T residue in APDTR but allow O-glycosylation of
the remaining solvent-exposed T and S residues, and that this could lead to further
stabilization of elongated structures. Muc-1 in the extended state cannot simply be
accommodated inside the cell. Therefore, the types of compaction shown in Figures
5B and 5C become relevant in the context of packaging mucin inside the cell.
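The order of magnitude of the span quoted above can be checked with a back-of-envelope extrapolation. The sketch assumes that the end-to-end length of the elongated 3TR structures (70-90 Å for three repeats, as reported in the Results) scales linearly with the number of repeats; this linear scaling is an assumption made here, not a result of the study.

```python
REPEATS_IN_3TR = 3  # 3TR contains three 20-residue tandem repeats

def extrapolated_span(n_repeats, r_3tr_angstrom):
    """Linearly scale the 3TR end-to-end length (angstroms) to n_repeats."""
    return r_3tr_angstrom / REPEATS_IN_3TR * n_repeats

# Range of end-to-end lengths reported for most sampled 3TR structures.
low = extrapolated_span(36, 70.0)
high = extrapolated_span(36, 90.0)
```

Under this assumption, thirty-six repeats would span roughly 840-1,080 Å, which is of the same order as the figure of more than 1,000 Å given in the text for the glycosylated molecule.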

The structural studies presented here assume a special importance because
underglycosylated mucin repeats are tumor-associated antigens in breast (7), pancreatic
(16), and ovarian (17) cancers. NMR data on 1 and 3 Muc-1 tandem repeats clearly
show how the immunodominant knob structures are preserved and presented only
when multiple tandem repeats are assembled in space. The knob-like structures on
intact Muc-1 may obscure access to the extended portions of the molecule by
antibody receptors on the surface of B-lymphocytes during the induction of an
immune response. This may explain the immunodominance of the accessible tip
(APDTR) of the knobs. Preliminary work further indicates that higher numbers of
tandem repeats (n=5) enhance the rigidity and presentation of the knobs. Hence it
is not surprising that previous studies have shown that increasing the number of
tandem repeats from 3 to 5 results in a nonlinear enhancement in the binding of
antibodies in the serum of cancer patients (15). Therefore, the combination of structural
and immunological studies of this type will be important for the rational
design of vaccines and immunotherapies for Muc-1 positive tumors.

Acknowledgments 

We thank Dr. Cliff Unkefer and Dr. Jill Trewella of Los Alamos National Laboratory,
and Dr. Dave Scott of the University of Iowa for use of their 500 MHz NMR
spectrometers. We also thank Dr. Ron Montelaro for support during the early phase of
this work. This work was supported by the US Army Grant RH03 to GG.

References  and  Footnotes 

1. S. Gendler, J. Taylor-Papadimitriou, T. Duhlig, J. Rothbard and J.A. Burchell, J. Biol. Chem. 263,
12820-12823 (1988).

2. J.R. Gum, J.C. Byrd, J.W. Hicks, N.W. Toribara, D.T.A. Lamport and Y.S. Kim, J. Biol. Chem. 264,
6480-6487 (1989).

3. J.R. Gum, J.W. Hicks, D.M. Swallow, R.L. Lagace, J.C. Byrd, D.T.A. Lamport, B. Siddiki and Y.S.
Kim, Biochem. Biophys. Res. Commun. 171, 407-415 (1990).

4. N. Porchet, N.V. Cong, J. Dufosse, J.P. Audie, V. Guyonnet-Duperat, M.S. Gross, C. Denis, P.
Degand, A. Bernheim and J.P. Aubert, Biochem. Biophys. Res. Commun. 175, 414-422 (1991).

5. J. Hilkens, F. Buijs and M. Ligtenberg, Cancer Res. 49, 786-793 (1989).

6. A. Girling, J. Bartkova, J. Burchell, S. Gendler, C. Gillet and J. Taylor-Papadimitriou, Int. J.
Cancer 43, 1072-1076 (1989).

7. K.R. Jerome, D.L. Barnd, K.M. Bendt, C.M. Boyer, J. Taylor-Papadimitriou, I.F.C. McKenzie,
R.C. Bast, Jr. and O.J. Finn, Cancer Res. 51, 2908-2916 (1991).

8. K.R. Jerome, D. Bu and O.J. Finn, Cancer Res. 52, 5985-5990 (1992).

9. S. Sell, Progress Path. 21, 1003-1019 (1990).

10. J. Taylor-Papadimitriou, Int. J. Cancer 49, 1-5 (1991).

11. P.X. Xing, J.J. Tjandra, S.A. Stacker, J.G. Teh, C.H. Thompson, P.J. McLaughlin and I.F.C.
McKenzie, Cell Biol. 67, 183-195 (1989).


Fontenot  et  al. 


12. J. Burchell, S. Gendler, J. Taylor-Papadimitriou, A. Girling, A. Lewis, R. Millis and D. Lamport,
Cancer Res. 47, 5476-5482 (1987).

13. P.L. Devine and I.F.C. McKenzie, BioEssays 14, 619-625 (1992).

14. A. Rughetti, V. Turchi, C.A. Ghetti, G. Scambia, P.B. Panici, G. Ronucci, S. Mancuso, L. Frati and
M. Nuti, Cancer Res. 53, 2457-2459 (1993).

15. Y. Kotera, J.D. Fontenot, G. Pecher, R.S. Metzgar and O.J. Finn, Cancer Res. 54, 2856-2860
(1994).

16. D.L. Barnd, M. Lan, R. Metzgar and O.J. Finn, Proc. Natl. Acad. Sci. USA 86, 7159-7163 (1989).

17. C.G. Ioannides, B. Fisk, K.R. Jerome, T. Irimura, J.T. Wharton and O.J. Finn, J. Immunol. 151,
3693-3703 (1993).

18. K.R. Jerome, N. Domenech and O.J. Finn, J. Immunol. 151, 1654-1662 (1993).

19. O.J. Finn, Biotherapy 4, 239-249 (1992).

20. J.D. Fontenot, O.J. Finn, N. Dales, P.C. Andrews and R.C. Montelaro, Pept. Res. 6, 330-336
(1993).

21. G.N. Ramachandran and V. Sasisekharan, Adv. Prot. Chem. 23, 283-437 (1968).

22. G. Gupta, G.M. Anantharamaiah, D.R. Scott, J.H. Eldridge and G. Meyers, J. Biomol. Struct. and
Dynam. 11, 345-366 (1993).

23. G. Gupta and G. Meyers, Computer analysis of HIV epitope sequences, 1-99-105 Pasteur Vaccins,
Paris, 1990.

24. G. Nemethy, M.S. Pottle and H.A. Scheraga, J. Phys. Chem. 87, 1883-1887 (1983).

25. R. Fletcher, Practical methods in optimization 1, John Wiley & Sons, New York, 1984.

26. S. Kirkpatrick, C.D. Gelatt, Jr. and M.P. Vecchi, Science 220, 671-680 (1983).

27. G. Gupta and G. Meyers, Analyses of various folding patterns of the HIV-1 loop, (Birkhauser,
Boston, 1994).

28. A. Bax and D.G. Davis, J. Magn. Res. 63, 207-213 (1985).

29. A.A. Bothner-By, R.L. Stephens, J.-M. Lee, C.D. Warren and R.W. Jeanloz, J. Am. Chem. Soc. 106,
811-813 (1984).

30. J. Jeener, B.H. Meier, P. Bachman and R.R. Ernst, J. Chem. Phys. 71, 4546-4553 (1979).

31. S. Macura and R.R. Ernst, Mol. Phys. 41, 95-117 (1980).

32. M.R. Price, F. Hudecz, C. O'Sullivan, R.W. Baldwin, P.M. Edwards and S.J.B. Tendler,
Immunol. 27, 795-802 (1990).

33. S. Tendler, Biochem. J. 267, 733-737 (1990).

34. M.J. Scanlon, S.D. Morley, D.E. Jackson, M.R. Price and S.J.B. Tendler, Biochem. J. 284, Ul-
144 (1992).

35. S.J.B. Tendler, M.J. Scanlon and M.R. Price, Protein and Peptide Letters 1, 39-43 (1994).

36. D.R. Rose, L.M. Gierasch and J.A. Smith, Adv. Prot. Chem. 37, 1-109 (1985).

37. K. Wuthrich, NMR of proteins and nucleic acids, John Wiley & Sons, New York, 1986.

38. C.M. Wilmot and J.M. Thornton, J. Mol. Biol. 203, 221-232 (1988).

39. K. Chandrasekhar, A.T. Profy and H.J. Dyson, Biochemistry 30, 9187-9194 (1991).

40. J.B. Ghiara, E.A. Stura, R.L. Stanfield, A.T. Profy and I.A. Wilson, Science 264, 82-85 (1994).

41. P. Plateau and M. Gueron, J. Am. Chem. Soc. 104, 7310-7311 (1982).

42. J.D. Fontenot, J.M. Gatewood, S.V.S. Mariappan, C.P. Pau, B.S. Parekh, J.R. George and G.
Gupta, Proc. Natl. Acad. Sci. USA, In Press (1994).

43. F.D.M. Veronese, M.S. Reitz, Jr., G. Gupta, M. Robert-Guroff, C. Boyer-Thompson, R.C. Gallo and P.
Lusso, J. Biol. Chem. 268, 25894-25901 (1993).

44. J.D. Fontenot, N. Tjandra, D. Bu, C. Ho, R.C. Montelaro and O.J. Finn, Cancer Res. 53, 5386-
5394 (1993).

45. N. Matsushima, C.E. Creutz and R.H. Kretsinger, Proteins: Structure, Function and Genetics 7, 125-
155 (1990).

46. E.E. Secarz, P.V. Lehmann, A. Ametani, G. Benichou, A. Miller and K. Moudgil, Annu. Rev.
Immunol. 11, 729-766 (1993).

47. I. Nishimori, F. Perini, K.P. Mountjoy, S.D. Sanderson, N. Johnson, R.L. Cerny, M.L. Gross, J.D.
Fontenot and M.A. Hollingsworth, Cancer Res. 54, 3738-3744 (1994).

48. M.J.L. Ligtenberg, F. Buijs, H.L. Vos and J. Hilkens, Cancer Res. 52, 2318-2324 (1992).

Date  Received:  December  10, 1994 

Communicated  by  the  Editor  Ramaswamy  H.  Sarma 


The Journal of Biological Chemistry

Vol. 268, No. 34, Issue of December 5, pp. 25894-25901

Printed in U.S.A.

Loss  of  a  Neutralizing  Epitope  by  a  Spontaneous  Point  Mutation  in 
the  V3  Loop  of  HIV-1  Isolated  from  an  Infected  Laboratory  Worker 

(Received  for  publication,  May  20,  1993,  and  in  revised  form,  August  16,  1993) 


Fulvia di Marzo Veronese, Marvin S. Reitz, Jr., Goutam Gupta, Marjorie Robert-Guroff,
Cynthia Boyer-Thompson, Audrey Louie, Robert C. Gallo, and Paolo Lusso


The third hypervariable region, or V3 loop, represents
the principal neutralizing domain of the gp120 envelope
glycoprotein of human immunodeficiency virus type 1
(HIV-1). Sequential viral isolates from a laboratory
worker (LW) accidentally infected with HIV-1 IIIB in 1985
were analyzed using type-specific neutralizing monoclonal
antibodies directed to the V3 loop. A single amino
acid substitution, Ala → Thr at position 21 in the V3 loop
of HIV-1 LW isolated in 1987, was shown to determine the
loss of the neutralizing epitope recognized by one of the
monoclonal antibodies (M77). However, this antibody efficiently
recognized linear V3 loop peptides containing
either the Ala or Thr residue at position 21, indicating
that a local change in conformation was responsible for
the epitope loss in the native gp120. Molecular modeling
studies, experimentally supported by different amino
acid replacements at position 21, indicated that the Ala →
Thr substitution leads to a drastic change in the domain
of the V3 loop, which contains the complementary
surface for antibody binding. These results provide evidence
for the first time that a conformation-dependent
epitope within the V3 loop of HIV-1 is involved in the
generation of neutralization escape mutants in vivo.


The primary translational product of the envelope gene of
the human immunodeficiency virus type 1 (HIV-1) is a gp160
which is processed to gp120 and gp41 as the external and
transmembrane glycoproteins, respectively (Veronese et al.,
1985; Allan et al., 1985). Because of its location on the surface
of the virion and of the infected cell, gp120 contains epitopes
naturally accessible to the immune system and thus represents
a prime candidate for the development of vaccine strategies.
Indeed, a domain of gp120, situated in the third hypervariable
region, has been identified as dominant for development of high
titered type-specific neutralizing antibodies against HIV-1 (Javaherian
et al., 1989; Rusche et al., 1988; Palker et al., 1988).
This principal neutralizing domain lies within a loop formed by
a disulfide bridge between 2 conserved cysteines. A distinctive
characteristic of this loop is the extensive degree of genetic
variability found among individual isolates of HIV-1 (Robert-
Guroff, 1990). This hypervariability, a likely consequence of
strong immune selective pressures in vivo, suggests that mutations
in the primary amino acid sequence of the neutralizing
loop become prevalent as viruses which escape neutralization
are selected.

Different approaches have been followed in order to identify
amino acid changes which are critical for neutralization. One
approach involved the in vitro generation of neutralization escape
mutant viruses under selective pressure of human sera
from HIV-infected individuals (Robert-Guroff et al., 1986) or
monoclonal antibodies to the V3 loop (McKeating, 1989). In
vitro studies with human neutralizing antibodies were the first
to demonstrate that HIV mutants are selected by immune pressures
(Robert-Guroff et al., 1986). In addition, these studies
identified a point mutation in gp41 (Ala → Thr replacement at
position 582) which resulted in neutralization resistance (Reitz
et al., 1988) but was apparently not part of the antibody binding
site (Wilson et al., 1990), indicating the involvement of
discontinuous epitopes in HIV neutralization. Molecular analysis
of mutant viruses obtained by in vitro selection with monoclonal
antibodies demonstrated amino acid changes within the
V3 loop itself (McKeating et al., 1989), but more interestingly
also in regions distantly located from the loop, suggesting an
interaction of this domain with other regions of gp120
(McKeating et al., 1989). Another approach involved the analysis
of sequential viral isolates obtained from chimpanzees experimentally
infected with a laboratory strain of HIV-1 (HIV-1 IIIB).
The in vivo passage of this isolate resulted in the
generation of variant viruses which were resistant to neutralization
by V3 loop-specific monoclonal antibodies (Nara et al.,
1990). Sequence analysis of the V3 loops from these mutants
did not reveal any amino acid difference between the
neutralization-resistant and -sensitive viruses, suggesting that the
crucial changes did not directly involve the antibody binding
site (Nara et al., 1990).

In this report, we studied the immunological reactivity and
neutralizing capability of two monoclonal antibodies to the V3
domain of HIV-1 IIIB with biologically active molecular clones
from sequential viral isolates obtained from a laboratory
worker who became accidentally infected with HIV-1 IIIB (Weiss
et al., 1988). This unique in vivo human model allowed us to
identify a single change in the V3 loop primary amino acid
sequence which is crucial for neutralization resistance by V3
loop-specific monoclonal antibodies. Moreover, we have performed
molecular modeling studies, experimentally supported
by amino acid replacements, to show that this single amino acid
substitution may cause a drastic change in the domain of the
V3 loop which provides the complementary surface for antibody
binding.

EXPERIMENTAL  PROCEDURES 

Neutralization  Assay  of  Cell-free  Virus— Ascitic  fluids  from  the  two 
monoclonal antibodies were filtered through a 0.2-μm filter and diluted
in complete RPMI 1640 medium. Aliquots of the respective antibody
were incubated with HIV-1 IIIB virus stock containing 25,000 cpm
reverse transcriptase activity in 100 μl final volume for 60 min at 4 °C
and then for 15 min at room temperature. CEM cells (1 x 10^) were
then added to the virus/antibody mixture and incubated for 60 min at
37 °C. The cells, still in the presence of virus and antibody, were
supplemented with 2 ml of complete RPMI medium and transferred to
six-well plates. Two ml of RPMI medium were added to each well after
24 h, and the reverse transcriptase activity of the supernatant was determined
on the first day when syncytia were clearly visible by microscopic
examination.
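The readout of this assay is described in the legend to Fig. 2 as the percent reduction of reverse transcriptase levels in antibody-treated wells compared with the control. A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and are not data from this study.

```python
def percent_neutralization(rt_treated_cpm, rt_control_cpm):
    """Percent reduction of reverse transcriptase activity relative to the
    untreated control well. Inputs are counts per minute."""
    if rt_control_cpm <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (1.0 - rt_treated_cpm / rt_control_cpm)

# Hypothetical counts, chosen only to show the arithmetic.
example = percent_neutralization(5000.0, 25000.0)
no_effect = percent_neutralization(25000.0, 25000.0)
```

A value near 100 indicates nearly complete neutralization at that antibody dilution, and a value near 0 indicates none.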

Syncytium  Inhibition  Assay — The  syncytia  assay  was  performed  in 
96-well  plates  by  coculturing  1  x  10®  uninfected  cells  with  5  x  10^ 
virus-infected cells. Monoclonal antibodies were added at different dilutions
to the mixture of cells, and the total volume of medium was
adjusted  to  0.2  ml.  The  plates  were  incubated  for  40  h  at  37  °C,  and  the 
number  of  multinucleated  giant  cells  was  determined  by  microscopic 
examination. 

Radiolabeling  of  Cells  and  Immunoprecipitation  of  HIV- 1  Proteins 
— Transfected  HeLa-tat  cells  or  infected  SupTl  were  suspended  for  1  h 
at 37 °C in cysteine-free media. [35S]Cysteine was then added to a final
concentration of 200 μCi/ml, and the cells were incubated for 18 h at
37 °C.

After labeling, the cells were washed in phosphate-buffered saline
and disrupted in phosphate-buffered saline containing 0.5% NaCl, 1%
Triton X-100, 0.5% sodium deoxycholate, and 0.1% sodium dodecyl sulfate
(SDS). The lysates were absorbed for 3 h at 4 °C with protein
A-Sepharose and protein A-Sepharose bound to rabbit anti-mouse light
chain antiserum and then clarified by ultracentrifugation. Radioimmunoprecipitation
analyses were performed by the addition to 0.5 ml of
labeled extract of either 10 μl of serum from an HIV-infected individual
Fig. 1. Scheme depicting the sequence for the V3 loop of the
pHXB2 molecular clone of HIV-1 IIIB and the location of the Thr
(LW virus from 1987) for Ala (HXB2) substitution. Also designated
are the binding sites for the M77 and 0.5β monoclonal antibodies.
and 0.2 ml of a 10% suspension of protein A-Sepharose or 1 μl of ascites
containing the appropriate antibody and 0.2 ml of a 10% suspension of
protein A-Sepharose bound to rabbit anti-mouse light chain antiserum.
The samples were incubated for 18 h at 4 °C. Immunoprecipitates were
collected by centrifugation, washed repeatedly, resuspended in
Laemmli sample buffer (Laemmli, 1970), heated for 2 min at 90 °C, and
analyzed by SDS-polyacrylamide gel electrophoresis.

Indirect  Immunofluorescence  and  Fluorocytometric  Analysis 
— Indirect  immunofluorescence  analysis  was  performed  on  live  cells 
with primary antibodies at 5 μg/ml and with fluorescein isothiocyanate-
conjugated goat antiserum to murine IgG (Sigma) as a second layer
antibody. Controls were incubated with an irrelevant mouse IgG1 antibody,
then stained with the same fluorescein isothiocyanate-conjugated
antiserum. After staining, the cells were fixed with 2% paraformaldehyde
and analyzed with a FACScan analyzer (Becton Dickinson Immunocytometry).
The results are expressed as arbitrarily normalized histograms
(i.e. relative number of cells versus fluorescence intensity). At
least 10,060 events were accumulated in all the experiments.

Construction of V3 Loop Cassette and env Chimerae — A 2.7-kbp SalI-
BamHI restriction enzyme fragment from pHXB2gpt, a molecular clone
of HIV-1 IIIB, containing the V3 loop region, was subcloned into the
cognate sites of pGEM4 (Promega, Madison, WI). A 0.6-kbp V3-containing
BglII fragment from this subclone was further subcloned into pSP72
(Promega). Two PCR amplifications were performed on this fragment,
such that one amplified fragment consisted of the region from the 5'
BglII site to the 5' terminus of the V3 loop and contained MluI and PstI
sites introduced at the downstream end of the fragment by a tail on the
antisense PCR primer. The other PCR fragment consisted of the region
from the 3' end of the V3 loop, where PstI and HpaI sites were introduced
at the upstream end of the fragment by a tail on the sense PCR
primer. The PCR fragments were purified, digested with BglII plus PstI,
and coligated into pSP72. The 0.5-kbp BglII fragment from the resultant
clone was ligated with the 5.1-kbp BglII fragment of the SalI-
BamHI env subclone of pHXB2gpt. The resulting plasmid has unique
MluI and HpaI restriction sites, whose introduction did not alter the
amino acid sequence, forming a cassette into which oligonucleotides
representing V3 coding regions and containing an MluI 4-base overhang
at one end and a blunt end at the other end can be ligated. This
construction is summarized in Fig. 4. The V3 loop coding region of
LW12.3 (Lori et al., 1992), an infectious molecular clone from the virus
isolated in 1987 from a laboratory worker accidentally infected with
HIV-1 IIIB and hereinafter designated LW87, was synthesized as a set of
four partially overlapping oligonucleotides, phosphorylated with T4 kinase,
annealed, and ligated into the MluI and HpaI sites of the cassette
plasmid. The resultant 2.7-kbp SalI-BamHI fragment, now containing


Fig. 2. A, inhibition of syncytia formation
between HIV-1 IIIB (panels 1 and 2) or
LW87 (panels 3 and 4) infected and uninfected
SupT1 cells. The assays were performed
without the addition (panels 1
and 3) and in the presence of M77 (panels
2 and 4) as described under "Experimental
Procedures." B, neutralization assays
with cell-free virions from HIV-1 IIIB (O),
HXB2 (0), and LW87 (■) were performed
with the indicated dilutions of M77 ascites
as described under "Experimental
Procedures." The neutralizing activity of
the antibody is expressed as the percent
reduction of reverse transcriptase levels
in the supernatants of antibody-treated
wells compared with those of the control.


HIV-1  V3  Loop  Neutralization  Escape  Mutant  in  Vivo 


[Figure 3: panels A, B, and C; lanes 1-3 for cells and virus; gp160 and gp120 bands indicated]

Fig. 3. Reactivity of M77 and 0.5β monoclonal antibodies with
gp120. SupT1 cells infected with HIV-1 IIIB (A) or LW87 (B) and HeLa-
tat cells transfected with pHXB2 (env LW85-2) (C) were labeled with
[35S]cysteine. The clarified lysates from cells and virus-containing supernatants
were then immunoprecipitated with an HIV-1 positive human
serum (lane 1), M77 (lane 2), and 0.5β (lane 3).


the V3 loop of LW87, was reintroduced into pHXB2gpt, which was then
used as a source for infectious chimerae of HXB2 with the LW87 V3 loop
(HXB2 (V3LW87)). Infectious clones of HXB2 containing V3 loops into
which Ser or Ile replaced Ala at position 21 were constructed in the
same way as HXB2 (V3LW87) except for the use of slightly different
oligonucleotides bearing the appropriate changes.

DNA  Transfection — The  infectious  molecular  clones  pLW87,  pHXB2, 
the chimeric construct of pHXB2 with the LW85-2 env (HXB2 (env
LW85-2)), and pHXB2 with the LW87 loop (HXB2 (V3LW87)) were
transfected into HeLa-tat cells by calcium phosphate coprecipitation
(Chen, 1987). Briefly, an appropriate mixture of DNA, sterile water, and
CaCl2 was added dropwise to a 2 x solution of 274 mM NaCl, 10 mM KCl,
1.5 mM Na2HPO4·7H2O, 12 mM dextrose, and 42 mM Hepes, pH 7.1,
while applying an airstream on the surface. After 10 min the mixture
was  pipetted  gently  into  a  flask  of  log  phase-growing  HeLa-tat  and 
incubated  with  the  cells  overnight  at  37  °C.  The  next  morning  the  cells 
were  washed  two  times  in  serum-free  medium,  then  fed  with  complete 
medium.  Twenty-four  hours  after  transfection,  the  cells  were  starved  in 
cysteine-free  medium  in  preparation  for  the  metabolic  labeling. 

Peptide Binding — A peptide enzyme-linked immunosorbent assay
(Robert-Guroff et al., 1992) was used to monitor antibody binding to V3
loop peptides. The peptides were obtained from Multiple Peptide Systems,
Richmond, CA and included the central 24 amino acids of LW87
(NNTRKRIRIQRGPGRTFVTIGKIG(C)) and BH10 (HIV-1 IIIB)
(NNTRKSIRIQRGPGRAFVTIGKIG(C)) V3 loop sequences. A peptide
representing amino acids 88-115 of the HIV-1 IIIB gag sequence
(VHQRIEIKDTKEALDKIEEEQNKSKKKA) served as negative control.
Peptides were bound at 2 µg/well to 96-well Immulon I microtiter
plates (Dynatech, Chantilly, VA) pretreated with polylysine (5 µg/well)
and 0.1% glutaraldehyde. Specific binding of serial dilutions of monoclonal
antibodies was detected using a goat anti-mouse IgG peroxidase
conjugate.
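
The two V3 peptides listed above can be compared position by position with a short script. This is only a sketch: the C-terminal "(C)" cysteine is left out, the first two residues of the LW87 peptide are assumed to be "NN" (they are illegible in the scanned source and are inferred from the stated length of 24 residues), and the offset of 5 between peptide and V3 loop numbering is an assumption chosen so that the Ala/Thr site falls at loop position 21, as the text states.

```python
# Central 24 residues of the V3 loop peptides, C-terminal "(C)" omitted.
BH10 = "NNTRKSIRIQRGPGRAFVTIGKIG"
LW87 = "NNTRKRIRIQRGPGRTFVTIGKIG"  # leading "NN" is an assumption (see lead-in)

# Assumed offset between peptide numbering and V3 loop numbering.
V3_OFFSET = 5


def differences(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based peptide position, residue in a, residue in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


for position, bh10_residue, lw87_residue in differences(BH10, LW87):
    loop_position = position + V3_OFFSET
    print(f"peptide position {position} (V3 loop position {loop_position}): "
          f"{bh10_residue} in BH10, {lw87_residue} in LW87")
```

Under these assumptions the peptides differ at two sites: the Ala/Thr site at loop position 21, and a Ser/Arg site at loop position 11, which lies outside the 12-25 and 16-29 antibody binding regions reported in the Results.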

Molecular Modeling — The methodology used for the molecular modeling
involved the following steps. First, the amino acid sequence of the
V3 loop was converted into secondary structural state using a prediction
algorithm (Deleage and Roux, 1989; Gupta and Myers, 1990) so that
each amino acid was assigned one of the four secondary states, i.e. helix
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Fig. 4. Derivation of chimeric viruses. The construction of the cassette for insertion of foreign V3 loops into the infectious clone pHXB2gpt,
described under “Experimental Procedures,” is shown above. The SalI-BamHI 2.7-kbp fragment, containing most of the env gene, was subcloned.
A BglII fragment containing the V3 region was subcloned, altered by PCR to replace the V3 loop coding region with cloning sites, and put back into
the SalI-BamHI fragment. The resultant SalI-BamHI construct was used as the cassette and inserted back into pHXB2gpt after introduction of
the LW87 V3 loop.
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Fig. 5. Fluorocytometric analysis of
CEM cells infected with HXB2 or
HXB2 (V3LW87). All the monoclonal antibodies
used are directed toward the
gp120 external env glycoprotein; M90 has
a group-specific reactivity, whereas both
M77 and 0.5β are type-specific for the
HIV-1 IIIB isolate. The empty profiles represent
the reactivity of the cells with an
irrelevant isotype-matched monoclonal
antibody.
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(h), beta (β), coil (c), or turn (t). Second, appropriate conformation domains
in the (φ, ψ)-space were then assigned to each of the 36 amino
acids corresponding to their secondary structural states. Third, the
local energy minima of the V3 loop were sampled under the constraint
of the S-S bridge between the 2 invariant cysteines in the V3 loop (Fig.
1). Sampling was done by combining Monte Carlo-simulated annealing
(Kirkpatrick et al., 1983) and energy minimization (Fletcher, 1984). The
force field of Sippl et al. (1984) was used for Monte Carlo and energy
minimization calculations.
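
The sampling step can be illustrated with a generic Monte Carlo simulated annealing loop over torsion angles. This is a toy sketch and not the authors' procedure: the energy function is a made-up stand-in for the Sippl et al. force field, the quadratic "closure" term is a hypothetical placeholder for the S-S bridge constraint, the separate energy minimization pass is omitted, and all parameter values are arbitrary.

```python
import math
import random


def toy_energy(angles):
    """Toy torsional energy in arbitrary units; NOT the Sippl et al. force field."""
    torsion = sum(1.0 + math.cos(3.0 * a) for a in angles)
    # Hypothetical restraint standing in for the S-S bridge constraint.
    closure = 0.1 * sum(angles) ** 2
    return torsion + closure


def anneal(n_angles=10, steps=20000, t_start=5.0, t_end=0.01, seed=0):
    """Metropolis Monte Carlo with geometric cooling; returns the best state visited."""
    rng = random.Random(seed)
    angles = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    energy = toy_energy(angles)
    best_angles, best_energy = list(angles), energy
    cooling = (t_end / t_start) ** (1.0 / steps)
    temperature = t_start
    for _ in range(steps):
        i = rng.randrange(n_angles)
        old_value = angles[i]
        angles[i] = old_value + rng.gauss(0.0, 0.3)  # perturb one torsion angle
        new_energy = toy_energy(angles)
        delta = new_energy - energy
        # Metropolis criterion: always accept downhill moves, sometimes uphill ones.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_angles, best_energy = list(angles), energy
        else:
            angles[i] = old_value  # reject the move
        temperature *= cooling
    return best_angles, best_energy


if __name__ == "__main__":
    _, lowest = anneal()
    print(f"lowest toy energy found: {lowest:.4f}")
```

The high starting temperature lets the chain cross barriers between minima, and the gradual cooling confines it to low-energy regions; in a workflow like the one described, each low-energy state would then be refined by energy minimization.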

RESULTS 

Failure of an Anti-HIV-1 IIIB gp120 Monoclonal Antibody to
Neutralize Infection by LW87 — The monoclonal antibodies chosen
as probes are designated M77 and 0.5β. They react with
HIV-1 IIIB gp120 in both its glycosylated and unglycosylated
form and were shown to inhibit syncytia formation and infection
by cell-free HIV-1 IIIB virions in a type-specific manner (Pal
et al., 1992; Matsushita et al., 1988). By examining a number of
gp120-derived peptides, the epitope recognized by M77 was
found to map between amino acids 12 and 25 (Pal et al., 1992)
of the sequence for the V3 loop of the HXB2 molecular clone of
HIV-1 IIIB as depicted in Fig. 1. 0.5β was reported previously to
bind between amino acids 16-29 of the same sequence (Skinner
et al., 1988). Thus, the epitopes for the two antibodies are
partially overlapping.
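
The overlap between the two mapped binding regions can be checked with a small interval calculation. This is a sketch in which each epitope is treated as a closed range of V3 loop positions; the variable and function names are illustrative only.

```python
def overlap(a, b):
    """Return the shared closed interval of two (start, end) ranges, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


M77_EPITOPE = (12, 25)     # V3 loop positions reported for M77
MAB_05B_EPITOPE = (16, 29)  # V3 loop positions reported for the second antibody

shared = overlap(M77_EPITOPE, MAB_05B_EPITOPE)
print(f"shared region: {shared}")
print(f"position 21 inside shared region: {shared[0] <= 21 <= shared[1]}")
```

The shared region spans positions 16-25, so position 21 lies within both mapped regions.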


We compared the ability of these antibodies to neutralize
infection by HIV-1 IIIB, HXB2, and LW12.3 (Lori et al., 1992), an
infectious molecular clone derived from a biological clone (no.
12) of the virus isolated in 1987 from a laboratory worker
accidentally infected with HIV-1 IIIB in 1985 (Weiss et al., 1988)
and hereinafter designated LW87. The only amino acid difference
within the M77 and 0.5β binding sites between LW87 and
HXB2 resides at position 21 in the sequence of the V3 loop,
where a threonine has been substituted for an alanine (Fig. 1).
This substitution was initially detected 1 year after infection
and persisted in all subsequent virus isolations.^ As expected,
M77 completely blocked the formation of syncytia and neutralized
the infectivity of both the IIIB and HXB2 viruses (Fig. 2).
However, M77 did not inhibit the formation of syncytia between
SupT1 cells infected with LW87 and uninfected SupT1 cells (Fig.
2A). Moreover, M77 failed to neutralize cell-free infection of
CEM cells by the same virus (Fig. 2B). In contrast, 0.5β inhibited
infection by LW87 in both experiments (data not shown).

Failure of M77 to Recognize Native gp120 from LW87 — Since
lack of neutralization is not necessarily associated with lack of


^ M. S. Reitz, Jr., L. Hall, M. Robert-Guroff, J. Lautenberger, B. M.
Hahn, G. M. Shaw, L. I. Kong, S. H. Weiss, D. Waters, R. C. Gallo, and
W. Blattner, submitted for publication.
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immunological recognition, we performed radioimmunoprecipitation
assays to determine if M77 was able to react with LW87
gp120 in its native form. SupT1 cells infected with HIV-1 IIIB
and LW87 were metabolically labeled with [35S]cysteine overnight.
Labeled proteins from cellular and viral lysates were
then immunoprecipitated with M77, 0.5β, and an HIV-1
antibody-positive human serum as a positive control.

M77 efficiently precipitated gp120 from both cellular and
viral extracts of HIV-1 IIIB-infected cells, but failed to do so with
both cellular and viral extracts of LW87-infected cells (Fig. 3, A
and B). In contrast, 0.5β immunoprecipitated gp120 from extracts
of both types of infected cells (Fig. 3, A and B). Similar
results were also obtained by live cell immunofluorescence labeling
followed by fluorocytometric analysis (data not shown).
Thus, lack of neutralization of LW87 by M77 is explained by the
inability of the antibody to react with native gp120 from this
virus. These results also suggest that antibodies 0.5β and M77,
despite partially overlapping epitopes in the V3 loop, differ in
their binding specificity since 0.5β reacted with native gp120
from the LW virus isolated in 1987. However, a decrease in the
efficiency of immunoprecipitation of gp120 from LW87 virus
was noticeable for 0.5β, suggesting a lowered binding affinity
for its epitope.

To substantiate and extend the observation obtained with
the molecular clone LW87, we repeated the previously described
experiments with a biological clone, designated LW87
no. 17, obtained from the same 1987 LW isolate grown in H9
cells. Again in contrast to 0.5β, M77 failed to neutralize infection
or to recognize gp120 from LW87 no. 17-infected cells in
both radioimmunoprecipitation and live cell immunofluorescence
labeling followed by fluorocytometric analysis (data not
shown), demonstrating that an independent biological clone
had lost the same neutralizing epitope.

To determine whether M77 recognized gp120 from an earlier
LW virus isolated shortly after infection in 1985 (LW85), which
has the same V3 loop as the molecular clone HXB2, we tested
a chimeric clone with the complete env of a clone of LW85
inserted in HXB2 (HXB2(env LW85-2), a kind gift by Drs. Beatrice
Hahn and George Shaw, University of Alabama, Birmingham,
AL). HeLa-tat cells were transfected with HXB2(env
LW85-2) and labeled with [35S]cysteine. Both M77 and 0.5β
efficiently immunoprecipitated HXB2(env LW85-2) gp120 from
labeled cellular and viral extracts (Fig. 3C). Moreover, both
monoclonal antibodies were able to completely neutralize the
infectivity of this chimeric virus (data not shown).

The Ala → Thr Change Is Crucial for Neutralization Resistance
and Recognition of Native gp120 by M77 — In addition to
the Ala → Thr replacement in the hypervariable loop, the LW
virus isolated in 1987 is divergent from HXB2 in more than 10
positions throughout the gp120 sequence. To determine
whether the Ala → Thr substitution was critical for the immunological
reactivity and neutralizing capability of M77, we
inserted the loop sequence of LW87 into HXB2, as described
in detail under “Experimental Procedures” and shown in
Fig. 4. The resulting infectious chimera was designated
HXB2(V3LW87). HeLa-tat cells were transfected with
HXB2(V3LW87) or with HXB2 as wild type control and cocultivated
with uninfected CEM cells 36 hours after transfection.
Approximately 3 days later, when formation of syncytia was
readily visible, CEM cells were collected and subjected to live
cell immunofluorescence labeling followed by fluorocytometric
analysis. The 0.5β antibody stained the surface of both
HXB2(V3LW87) and HXB2-infected cells as did the positive
control M90, another monoclonal antibody which reacts with
an exposed conserved epitope in gp120 unrelated to the V3 loop
(Fig. 5). In contrast, the M77 antibody positively reacted with
wild type HXB2-infected cells, but failed to stain the surface of
HXB2(V3LW87)-infected cells. These results clearly indicated
that the Thr for Ala change was sufficient to determine the loss
of the M77 epitope, ruling out that other amino acid changes
between HXB2 and LW87 outside the V3 loop were necessary
for loss of recognition of the epitope by M77.

Recognition of Linear V3 Loop Peptides by M77 — As mentioned
above, the only amino acid difference between the LW87
and HXB2 V3 loops consists of the substitution of a Thr for an
Ala at position 21. When we assayed the ability of M77 and 0.5β
to bind in enzyme-linked immunosorbent assay to V3 linear
peptides containing either Ala or Thr at that position, both
antibodies bound to the two peptides (Fig. 6). These results
indicated that M77 was still able to bind to the linear form of
the epitope and suggested that a local change in conformation of
the epitope in the functionally folded protein resulted in the
loss of neutralization.

Molecular Modeling and Experimental Validation of Theoretical
Predictions — Molecular modeling studies using a simulated
annealing approach were performed in order to understand
the effect of the Ala → Thr substitution on M77 antibody
binding. The folded forms of the V3 loop for the Ala and Thr
analogs, as well as the extended forms, were computed using
the methodology outlined under “Experimental Procedures.”
Molecular modeling studies indicated that the Ala → Thr substitution
can disrupt the folded epitope at the tip of the V3 loop
(Figs. 7 and 8). Fig. 7, A and B, shows ribbon and skeleton
models of the V3 loop for the Ala analog. In this folded motif
residues 16-28 of the V3 loop, which contain the crucial sequence
for M77 binding, form a compact protruding surface in
which the Ala at position 21 resides in the interior of the contact
surface and makes close contact with atoms of the neighboring
residues (Fig. 7B). Thus, the Ala → Thr substitution should
result in an enlargement of the interior to accommodate the
bulkier side chain. This, in turn, should enlarge the contact
surface of the 16-28 motif with a resultant loss of M77 binding.

The  relative  stability  of  the  folded  form  of  the  V3  loop  was 


Fig. 6. Antibody binding to V3 loop peptides. Peptides from
BH10 (▲), LW 1987 (•) V3 loops and an irrelevant peptide (■) were
coated on plates and reacted with M77 (A) and 0.5β (B). Sequences of
the peptides are given under “Experimental Procedures.”
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Fig. 8. The ribbon (A) and skeleton (B) models of the V3 loop of the Thr analog in the extended conformation. This conformation can
also be adopted by the Ala analog. This conformation is taken as the reference (unfolded state). The relative stabilities of the folded states of the
Ala and Thr analogs are measured from this reference state. In the extended conformation, residues 16-28 occupy a larger surface area than in
the folded one.
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Fig. 9. Fluorocytometric analysis of CEM cells infected with HXB2 mutants. These are as follows: HXB2 bearing Ala (Ala21), Ser (Ala21 →
Ser), Thr (Ala21 → Thr), or Ile (Ala21 → Ile) at position 21 in the V3 loop. The empty profiles represent the reactivity of the cells with an irrelevant
isotype-matched monoclonal antibody.


failed to react with cells infected by HXB2 containing Thr
(Ala21 → Thr) or Ile (Ala21 → Ile) at that position (Fig. 9). Thus,
the M77 reactivity perfectly correlated with the modeling predictions.

DISCUSSION 

The availability of sequential viral isolates from a laboratory
worker accidentally infected with the prototype HIV-1 strain
IIIB gave us the unique opportunity of studying the influence of
naturally occurring mutations on V3 loop-dependent neutralization.
Different molecular clones of LW obtained shortly after
infection were genetically almost indistinguishable among
themselves and from HIV-1 IIIB-derived molecular clones.^ Indeed,
within the V3 region, they completely retained the sequence
of the HXB2 clone from HIV-1 IIIB. An Ala to Thr substitution
within the GPGRAF motif at the tip of the loop was
detected 1 year after infection and persisted in all subsequent
isolations.^

We have observed that the HIV-1 IIIB-neutralizing monoclonal
antibody M77 was unable to neutralize infection by a
biologically active molecular clone of LW isolated in 1987 and
bearing the Thr for Ala substitution at position 21 within the
V3 loop. Lack of neutralization does not always correlate with
lack of immunological recognition, since it has been shown that
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monoclonal antibodies can bind the V3 loop without neutralization
(Nara et al., 1992). In this case, however, lack of neutralization
by M77 was a direct consequence of the inability of
the antibody to bind native gp120, as indicated by immunoprecipitation
and fluorocytometric analysis.

The antibody M77 did not react with the chimeric virus obtained
after insertion of the LW87 V3 loop in HXB2, clearly indicating
that the Ala → Thr substitution is sufficient for the
abrogation of antibody binding to native gp120. Thus far, only
one neutralization-resistant escape mutant selected in vitro
with monoclonal antibodies to the V3 loop showed an amino
acid change within the binding site of the selecting antibody
(McKeating et al., 1989). This change involved the substitution
of a G for an R at position 13 in the sequence for the V3 loop and
clearly affected the binding of the antibody to the linear epitope.
In contrast, the majority of the neutralization-resistant
mutants which were selected in vitro with monoclonal antibodies
to the V3 loop did not show any amino acid sequence variation
within the loop itself, suggesting that critical changes
affecting virus neutralization reside outside the V3 domain
(McKeating et al., 1989). This, in fact, has been proven in the in
vitro immune escape system with human antisera (Robert-Guroff
et al., 1986), in which a point mutation in the transmembrane
portion of the env gene resulted in neutralization
resistance (Reitz et al., 1988). Subsequent studies showed that
this mutation affected neutralization by conformational
changes induced by an Ala → Thr change at position 582 in
gp41 (Wilson et al., 1990).

These in vitro findings were also supported by results generated
in vivo in chimpanzees (Nara et al., 1990). Experimental
infection of chimpanzees with HIV-1 IIIB resulted in the rapid
emergence of mutant viruses resistant to neutralization by V3-specific
neutralizing antibodies. Sequencing of the envelope
gene from these viruses revealed the complete identity of the
V3 amino acid sequence between the neutralization-resistant
mutant viruses and the neutralization-sensitive molecular
clones contained in the original inoculum. However, several
amino acid substitutions occurred throughout gp120, indicating
again that the changes affecting virus neutralization resided
outside the V3 loop. In contrast, our data show that the
mutation of a single residue within the V3 loop occurring in
vivo in a human results in the loss of the binding site for a
V3-specific neutralizing antibody.

The antibody M77 was able to bind to the linear form of its
epitope, implying that a local change in conformation was actually
responsible for the loss of the epitope itself in the functionally
folded protein. It is conceivable that M77 reacts with
both isolated V3 peptides because these are flexible, do not
have structural constraints, and thus have more freedom to
adapt to the antibody complementarity region. However, the
M77 antibody can bind only the (S-S)-bridged V3 loop Ala analog
but not the V3 loop Thr analog. This means that neither Ala
nor Thr at the same position of the V3 loop makes direct contact
with the antibody. If either did, the M77 antibody
would have discriminated between the Ala and the Thr analogs
in the linear peptide epitopes. Therefore, it appears that the
(S-S)-bridged V3 loop imposes certain stereochemical constraints
such that only the Ala analog, but not the Thr analog,
can present the antibody-binding domain of the V3 loop in an
effective manner to the M77 antibody.

Our molecular modeling studies, experimentally supported
by different amino acid replacements, suggested that the Thr
for Ala substitution can disrupt the folded antigenic epitope at
the tip of the V3 loop (amino acids 16-28). In the folded form
the amino acids forming the epitope adopt a protruding surface,
in which the Ala21 residue resides in the interior of the contact
surface, while the Ala → Thr substitution requires enlargement
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of  the  contact  surface  resulting  in  the  loss  of  antibody  binding. 

This report demonstrates that the V3 loop can adopt a specific
structure in the presence of the antibody. Both from secondary
structure prediction algorithms and two-dimensional
NMR studies on free V3 loops, it has been suggested that the
GPGR sequence at the center of the immunogenic tip can adopt
a type II β-turn (LaRosa et al., 1990; Chandrasekhar et al.,
1991). However, no evidence has been presented to date that
the GPGR sequence together with the flanking amino acids can
form a well defined combining site for antibodies. Amino acids
on the surface of this combining site provide direct contact with
the antibody binding pocket, whereas the amino acids in the
cavity determine the size and shape of the cavity itself while
maintaining a specific geometry of the antibody contact domain
of the V3 loop.
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Abstract 

Objectives: The immunological properties of a panel of human mucin MUC1/HIV
V3 loop chimeras are evaluated.

Design: The immunodominant epitope of MUC1 (APDTR) was found to be
structurally isomorphous with the tip of the principal neutralizing determinant
(PND) of HIV-1 (MN) (GPGRA). A panel of 120-residue, 6-tandem-repeat and
60-residue, 3-tandem-repeat chimeric antigens was constructed in which the
repeating MUC1 epitope is replaced by the principal neutralizing determinant of
HIV-1. Each 20-residue tandem repeat contains one PND epitope. The PND of
HIV-1 is presented in the native β-turn conformation at the crest of each repeating
knob structure of the mucin-like protein.

Methods: The antigenicity of the chimeric antigens is compared using ELISA and
HIV-infected patient sera. Structural effects of antibody-antigen interactions are
determined using surface plasmon resonance (SPR) with human monoclonal
antibodies, chimeric antigens, and the cyclic and linear V3 loops. Immunogenicity of
3 versus 6 tandem repeats is measured in mice.

Results: Nine residues of the HIV PND substituted into the mucin backbone were
equivalent to the 36-residue cyclic V3 loop in ELISA. The 120-residue antigens
induced high-titer IgM and IgG HIV-specific antibodies in mice.

Conclusions: MUC1/V3 chimeras efficiently detect HIV-specific antibodies in
patient sera. Multivalent presentation of the PND is advantageous for higher
affinity antibody-antigen interactions and for inducing HIV-specific IgM and IgG
antibodies.

KEY WORDS: Human Mucin MUC1, HIV-1, V3 Loop, tandem repeats,
Immunogenicity, Antigens, Surface Plasmon Resonance
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Introduction 

The external surface unit glycoprotein (gp120) of the human
immunodeficiency virus type 1 contains multiple disulfide-bonded loops [1]. The
principal neutralizing determinant (PND) is located inside the third variable loop
(V3 loop) of gp120 [2-4]. The common structural element of the V3 loop PND is
the type II reverse turn (GPGR) near the midpoint of the 35-residue-long loop [5-10].
PND-specific neutralizing antibodies generally recognize the GPGR turn and two or
three flanking residues, either toward the C- or N-terminus [11]. Recent studies
show that this relatively conserved structural feature of the HIV-1 PNDs is further
characterized by the formation of a solvent-accessible protruding motif (designated
knob), with the type II turn at the crest [7, 8, 12]. However, during an immune
response, the structural conservation of the PND may be masked by the variable
flanking sequences in the V3 loop of gp120.

Our work with the structure of the human mucin MUC1 tandem repeat
domain supported the design of protein constructs in which HIV-1 PNDs could be
presented by this multivalent antigen in their "correct" conformations [13-16].
Human mucin MUC1 is a cell surface glycoprotein with a large tandem repeat (TR)
domain [17, 18]. The number of tandem repeats is variable, but individuals may
carry 20-100 perfect copies of a twenty-residue proline, serine, threonine, glycine
and alanine rich tandem repeat (the tandem repeat sequence is
GVTSAPDTRPAPGSTAPPAH) [19, 20]. In breast, pancreatic, and ovarian cancers
the MUC1 tandem repeat sequence APDTR appears to be immunodominant for
antibody binding specificity, similar to the GPGR of HIV-1 (MN) [21-23]. The high
resolution structure of MUC1 revealed repeating (every 20 residues) solvent-exposed
knobs, with an immunodominant APDTR type II turn at the crest [15]. We have
also shown that isomorphous replacement of the mucin knobs with the HIV-1
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PND knobs results in mucin-like HIV-1 antigens that retain the structural purity of
the HIV-1 epitope, the global extended structure of mucin, and are recognized by
polyclonal sera from HIV-1 infected individuals [16].
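
The repeat arithmetic behind the 60- and 120-residue scaffolds can be sketched with the native MUC1 repeat quoted above. This only illustrates the unmodified mucin backbone: the chimeric mucV3 sequences are defined in Table I, which is not reproduced here, so no chimera is constructed, and the helper names are hypothetical.

```python
MUC1_REPEAT = "GVTSAPDTRPAPGSTAPPAH"  # native 20-residue tandem repeat from the text
EPITOPE = "APDTR"                     # immunodominant MUC1 epitope


def tandem(repeat: str, copies: int) -> str:
    """Concatenate the given number of copies of one repeat unit."""
    return repeat * copies


def epitope_positions(sequence: str, epitope: str) -> list[int]:
    """Return the 1-based start position of every occurrence of the epitope."""
    positions = []
    start = sequence.find(epitope)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(epitope, start + 1)
    return positions


for copies in (3, 6):
    scaffold = tandem(MUC1_REPEAT, copies)
    print(f"{copies} repeats: {len(scaffold)} residues, "
          f"epitope starts at {epitope_positions(scaffold, EPITOPE)}")
```

Each repeat contributes exactly one copy of the epitope, spaced 20 residues apart, which is the periodicity the knob structure relies on.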

In this paper, we describe how the immunologic properties of six-, seven-, and
nine-residue sequences, which include the β-turn (GPGR) of the HIV-1 V3 (MN), are
enhanced by presentation within the mucin backbone. In addition, we show that
large multivalent antigens (N = 3, 6) are advantageous over monovalent linear
peptides for detecting HIV-1 PND antibodies in the serum of HIV-infected
individuals. By studying the kinetics of HIV-specific monoclonal antibodies
interacting with chimeric mucin-V3 loop (mucV3) multivalent antigens, we
determine how the larger 120-residue multivalent antigens containing six antigenic
structural knobs increase the affinity of the antigen-antibody interaction over the
60-residue antigens with three structural knobs.
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Materials  and  Methods 

Antigens 

The principles of design and NMR structural information of the mucV3 chimeric
polypeptides were described earlier [16]. These include both the 60- and 120-residue
synthetic peptides of the mucV3 chimeras (Table I). The chimeras mucV3-6,
mucV3-7, and mucV3-9 contain either six (IGPGRA), seven (HIGPGRA), or nine
(IHIGPGRAF) residues, respectively, of the HIV-1 V3 (MN) loop per twenty-residue
tandem repeat. In each antigen, as shown in Table I, mucin residues were removed
and replaced by HIV-1 residues, and the length of the repetitive element was
maintained at 20 residues. The intention was to maintain the overall mucin
structure, but to replace the immunodominant knobs of mucin with the
immunodominant knobs of the V3 loop. The sequence of one tandem repeat for
human mucin MUC1, mucV3-6 containing 6 PND residues per 20-residue tandem
repeat, mucV3-7 containing 7 PND residues per 20-residue tandem repeat, and
mucV3-9 containing 9 PND residues per 20-residue tandem repeat are shown in
Table I. Notice that there are two sizes of each mucV3 chimeric peptide, one with
three tandem repeats or 60 residues and one with six tandem repeats or 120 residues
(Table I). Additional V3 (MN) antigens evaluated include the full-length cyclic
(oxidized) and linear (reduced) forms of the V3 loop of HIV-1 (MN), and a
15-residue linear peptide, designated 1143D (DKRIHIGPGRAFYTT). The N-terminal
aspartic acid was added for enhanced peptide binding to microwells.

Peptide  Synthesis,  Purification  and  Mass  Characterization 

All  peptides,  except  the  15  residue  1143D,  were  peptide  amides  and  were 
synthesized  by  a  manual  solid-phase  strategy  using  9-fluorenylmethyloxycarbonyl 
protected  amino  acids  as  described  in  detail  elsewhere  [14].  Peptide  molecular 
weights  were  characterized  by  electrospray  ionization  mass  spectroscopy  at  the 
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Protein  and  Carbohydrate  Structure  Facility,  University  of  Michigan,  Medical  School 
under  the  direction  of  Dr.  Phil  Andrews.  In  each  case  the  difference  between 
expected  and  observed  peptide  molecular  weights  was  within  the  experimental  error 
of  the  spectrometer.  For  example,  the  expected  molecular  weights  of  the  mucV3-6, 
mucV3-7 and mucV3-9, 120 residue peptides are 10,922, 10,994 and 11,380 Daltons,
and the observed molecular weights are 10,916, 10,988 and 11,370 Daltons.
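
The agreement between expected and observed masses quoted above can be expressed as absolute and relative differences. This is a minimal sketch using only the six values given in the text; the instrument's actual error tolerance is not stated in the source, so none is assumed here.

```python
def mass_error(expected: float, observed: float) -> tuple[float, float]:
    """Return (observed - expected in Da, absolute difference as percent of expected)."""
    difference = observed - expected
    return difference, 100.0 * abs(difference) / expected


# Expected and observed molecular weights (Da) of the 120-residue peptides.
MASSES = {
    "mucV3-6": (10922.0, 10916.0),
    "mucV3-7": (10994.0, 10988.0),
    "mucV3-9": (11380.0, 11370.0),
}

for name, (expected, observed) in MASSES.items():
    difference, percent = mass_error(expected, observed)
    print(f"{name}: {difference:+.0f} Da ({percent:.3f}% of expected)")
```

All three observed masses fall within 10 Da of the expected values, which is under 0.1% of the expected mass in each case.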

Patient  Sera,  Monoclonal  Antibodies  and  ELISA  Methods 

A panel of HIV-1 sera from patients living in Honduras was collected, characterized
and maintained at the Centers for Disease Control by the Division of HIV/AIDS in
Atlanta, GA. These patients are infected with HIV-1 genotypes characteristic of
clade B viruses. The V3 loop sequence of the predominant virus population for 22
of these individuals has been determined [24]. The HIV-1 (MN) isolate, on which
the mucin V3 chimeric antigens are based, also belongs to genotype B, as do the
majority of HIV-1 isolates found in North America [25]. In addition, we used
serum samples from eleven individuals known to be infected with diverse non-clade
B HIV-1 subtypes; these included 2 from Rwanda, 4 from Brazil, 3 from
Thailand, and 2 from Uganda [24]. The predominant V3 loop sequence from all of
these individuals was determined to be non-clade B, both by sequence and serology
[24]. V3 loop-specific monoclonal antibodies 447, 412, 453, 386, 268, 782, 257, 391, 838,
419, 181, 908, and 537 were derived from AIDS patients at the New York Veterans Affairs
Medical Center, New York, NY as described previously [11, 26]. Additional V3
monoclonal antibodies used in this study, 9205 and 9284 [27], were obtained from
Dupont (Boston, MA), and mAb 50.1 [28] was obtained from Repligen (Cambridge,
MA). The ELISA method using patient sera was described in detail elsewhere [29].
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BIAcore  Measurement  of  Antibody  Binding  and  Affinity 

The 60 and 120 residue mucV3-9 peptide antigens were selected for affinity
measurements with V3 mAbs using surface plasmon resonance (SPR) (BIAcore™,
Pharmacia Biosensor, Piscataway, NJ) because these antigens showed maximum
binding to V3 mAbs. Monoclonal antibodies 412, 447, and 453 were chosen for this
experiment because their dissociation rates are faster than 1 × 10^-5 sec^-1, as
required for accurate measurements. The SPR technology is described in detail
elsewhere [30-32]. Both the 60 and 120 residue mucV3-9 peptides were covalently
attached to a dextran matrix by EDC/NHS chemical activation and coupling of free
amines on the peptide N-termini to activated carboxyl groups on the resin. The optimal
immobilization buffer was 10 mM 2-(N-morpholino)ethanesulfonic acid (MES) at
pH 6.0. The eluent buffer was 10 mM HEPES, 150 mM NaCl, 3.4 mM EDTA, 0.05%
BIAcore surfactant P20, pH 7.4. Flow rates of 5 µl/min and 2 µl/min were used for
the immobilizations and the binding experiments, respectively. The unreacted EDC-
esters were blocked with an injection of 1 M ethanolamine. The amount of protein
was determined by the increase in baseline level of plasmon resonance.

Kinetic analysis was performed as described [30-33]. Serial 2-fold dilutions
of the antibodies in HBS buffer were injected across the mucV3 matrices at 5
µl/min. Following the injection of the antibody, the dissociation was monitored for
15 minutes by flowing buffer over the biosensor matrix. The apparent dissociation
and association rates were determined using the equation

$dR/dt = -(k_1 C + k_{-1}) R + k_1 C R_{max}$,

where $k_1$ is the apparent association rate constant and $k_{-1}$ is the apparent
dissociation rate constant. $R_{max}$ is the maximum binding capacity of the peptide
immobilized on the dextran matrix, and $R$ is the amount of bound antibody (in
Response Units, RU) at time $t$ and concentration $C$. A linear plot of $dR/dt$ versus
$R$ yields a slope of $-(k_1 C + k_{-1})$ and a y-intercept of $k_1 C R_{max}$. $dR/dt$ is
determined by measuring the slope at several points along the antibody-peptide
association curve. The slopes of these plots were plotted versus the antibody
concentration to obtain $k_1$. $k_{-1}$ was obtained from the equation
$\ln(R_{t1}/R_{t0}) = k_{-1}(t_0 - t_1)$, where $R_{t0}$ is the response immediately
following the completion of the antibody injection and $R_{t1}$ is the response at a
later time. A plot of $\ln(R_{t1}/R_{t0})$ versus $(t_0 - t_1)$ yields the apparent
dissociation rate, $k_{-1}$.
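
The two-step linear analysis described above can be sketched in Python with NumPy. This is a minimal illustration, not the BIAcore evaluation software: the function names are ours, the rate constants in the demonstration are hypothetical, and the curves are synthetic and noise-free. With a measured sensorgram, dR/dt would first have to be estimated numerically from the association curve.

```python
import numpy as np


def observed_rate(R, dRdt):
    """Return ks = k1*C + k_1 from one association curve.

    dR/dt = k1*C*Rmax - (k1*C + k_1)*R is linear in R, so ks is minus
    the slope of a straight-line fit of dR/dt against R.
    """
    slope, _intercept = np.polyfit(R, dRdt, 1)
    return -slope


def association_rate(concentrations, ks_values):
    """Return k1 as the slope of ks plotted against antibody concentration."""
    k1, _intercept = np.polyfit(concentrations, ks_values, 1)
    return k1


def dissociation_rate(t, R):
    """Return k_1 from ln(R_t1/R_t0) = k_1*(t0 - t1), fitted over all points."""
    t = np.asarray(t, dtype=float)
    R = np.asarray(R, dtype=float)
    k_off, _intercept = np.polyfit(t[0] - t, np.log(R / R[0]), 1)
    return k_off


if __name__ == "__main__":
    # Synthetic curves built from hypothetical constants, not measured data.
    k1_true, k_off_true, r_max = 6.0e4, 5.0e-4, 1000.0
    t = np.linspace(0.0, 300.0, 61)                    # seconds of injection
    concs = np.array([200e-9, 100e-9, 50e-9, 25e-9])   # M, serial 2-fold dilutions
    ks = []
    for c in concs:
        k_s = k1_true * c + k_off_true
        r_eq = k1_true * c * r_max / k_s
        R = r_eq * (1.0 - np.exp(-k_s * t))
        dRdt = k_s * r_eq * np.exp(-k_s * t)
        ks.append(observed_rate(R, dRdt))
    print("k1  =", association_rate(concs, ks))
    t_d = np.linspace(0.0, 900.0, 91)                  # 15 minutes of buffer flow
    print("k-1 =", dissociation_rate(t_d, 500.0 * np.exp(-k_off_true * t_d)))
```

The intercept of the second fit (ks versus C) also estimates the dissociation rate, which gives a consistency check against the value from the dissociation phase.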


Immunogenicity  of  the  Antigens  in  Mice 

Peptide immunogenicity studies were performed using 4 week old female
Balb/c mice in groups of 4 mice/antigen. One set of immunizations used an
initial intraperitoneal injection that consisted of 100 µg antigen in 50 µl of phosphate
buffer emulsified with 50 µl Freund's complete adjuvant, for a total volume of
100 µl/mouse. The initial injection was followed by 4 subsequent boosts of 100 µg of
antigen emulsified in Freund's incomplete adjuvant (100 µl total volume) at
approximately 3 week intervals. To assess the role of antigen valency and dose on
the antibody response, groups of 4 mice/antigen were immunized intraperitoneally
using 2, 10, and 20 nmoles/immunization of the mucV3-9 (120) or mucV3-9 (60) residue
peptides. In this experiment, the initial immunization was followed by 3 boosts at
approximately 4 week intervals. Blood was collected by tail bleeds 5-6 days after
immunizations.
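
Because one immunization series is dosed by mass (100 µg) and the other by amount (nmoles), a unit conversion helps when comparing them. The sketch below assumes the expected molecular weight of 11,380 Da quoted earlier for the 120 residue mucV3-9 peptide. The mass of the 60 residue peptide is not given in the text, so it is left out, and the function names are ours.

```python
MW_MUCV3_9_120 = 11380.0  # Da, expected mass quoted for the 120 residue mucV3-9


def nmol_to_ug(nmol, mw_da):
    """Convert nmol to micrograms: 1 nmol of an M Da molecule weighs M ng."""
    return nmol * mw_da / 1000.0


def ug_to_nmol(ug, mw_da):
    """Convert micrograms to nmol for a molecule of mass mw_da Daltons."""
    return ug * 1000.0 / mw_da


if __name__ == "__main__":
    for dose in (2, 10, 20):
        print(f"{dose} nmol = {nmol_to_ug(dose, MW_MUCV3_9_120):.1f} ug")
    print(f"100 ug = {ug_to_nmol(100, MW_MUCV3_9_120):.1f} nmol")
```

On this assumption the 100 µg dose of the 120 residue peptide corresponds to just under 9 nmol, close to the 10 nmol dose.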

Results

The  Importance  of  Antigen  Structure 

A panel of human V3 mAbs from infected patients was used to assess the
importance of V3 structure on binding affinity [11, 34]. The ability of these human
mAbs to distinguish between the linear (reduced) and cyclic (oxidized)
forms of the HIV-MN V3 loop was evaluated using SPR. The uncertainty in
determining the amount of bound antigen, which is a limitation in ELISA, was
eliminated in the current experiment, since the level of each peptide bound to the
dextran matrix can be determined. The binding data are therefore scaled by this
amount, and errors due to differences in the amount of bound antigen are eliminated.
For example, the cyclic V3 MN peptide coupled to the dextran matrix with a response
of 831 RU, and the linear V3 MN peptide coupled with a response of 1124 RU. The mAb
447 bound to the matrix-attached cyclic V3 MN peptide with a response of 3355 RU and
to the matrix-attached linear V3 MN peptide with a response of 3591 RU. The binding
reactivity ratios for mAb 447 are therefore 3355/831 = 4.0 for the cyclic peptide and
3591/1124 = 3.2 for the linear peptide. Thus mAb 447 preferentially bound to the
cyclic form of the loop by a ratio of 4.0/3.2 = 1.25, which is a 25% preference.
In Table II, 15 out of 15 monoclonal antibodies showed a higher binding reactivity
ratio to the cyclic V3 loop (13-27%). These data support other observations showing
that antibodies derived from HIV-1 infected patients have a higher binding affinity
for the cyclic form of the V3 loop [12].
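
The normalization used for mAb 447 can be written out as a short calculation. The sketch below uses only the four response values quoted in the text; the function names are ours. The text forms the preference from the rounded ratios (4.0 and 3.2). The unrounded responses give a slightly larger figure, about 26%.

```python
def reactivity(antibody_ru, antigen_ru):
    """Antibody response divided by the response from antigen loading."""
    return antibody_ru / antigen_ru


def cyclic_preference_percent(cyclic_ratio, linear_ratio):
    """Percent by which the cyclic reactivity ratio exceeds the linear one."""
    return 100.0 * (cyclic_ratio / linear_ratio - 1.0)


if __name__ == "__main__":
    cyclic = reactivity(3355, 831)    # mAb 447 on cyclic V3 MN
    linear = reactivity(3591, 1124)   # mAb 447 on linear V3 MN
    print(f"cyclic ratio = {cyclic:.1f}, linear ratio = {linear:.1f}")
    rounded = cyclic_preference_percent(round(cyclic, 1), round(linear, 1))
    print(f"preference from rounded ratios   = {rounded:.0f}%")
    print(f"preference from unrounded ratios = {cyclic_preference_percent(cyclic, linear):.1f}%")
```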

Enhanced  Antigenicity  and  Diagnostic  Potential 

The ELISA reactivity of a panel of HIV-1 sera was used to assess the
antigenicity of the V3 peptide antigens in Table I. The antigenicity of the mucin V3
loop chimeric antigens was compared to the antigenicity of the cyclic and linear
forms of the 36 residue V3 loop, the 15 residue linear HIV-MN peptide, and a 105
residue mucin MUC1 peptide. The panel of sera represents 29 patients from
Honduras infected with clade B viruses characteristic of the North American
isolates. For comparison, the serological reactivity of 11 sera from patients infected
with viral strains whose PND sequence is non-clade B was measured.

Serological reactivity of these HIV-1 sera with the chimeric antigens (1/100
dilution of serum) as measured by ELISA is shown in Figure 1. The data are
displayed in order to emphasize the increasing antigenicity of either six
(IGPGRA) (Fig. 1A), seven (HIGPGRA) (Fig. 1B), or nine (IHIGPGRAF) (Fig. 1C)
residues of the HIV-V3 loop when presented in the context of a multi-valent
mucin-like knob. As shown in Figure 1A, when only 6 residues of the PND are
used, 65% of the clade B sera tested positive (24% show a strong reaction and 41%
have an intermediate level of serological reactivity). The number of clade B patients
who show a strong (52%) to intermediate (24%) level of serological reactivity
increases to about 76% when seven residues of the V3 loop are substituted into
MUC1 (Figure 1B). The strong and intermediate categories were enhanced by 28%
and 15% respectively, simply by adding a single residue of the V3 loop into the
mucin chimera. As shown in Figure 1C, substitution of nine residues of the V3
loop into mucin results in 100% of the clade B sera showing either strong (90%)
or intermediate (10%) reactivity. Increasing the number of PND residues substituted
into the mucin backbone from 7 to 9 results in significant enhancement of the
number of sera in the strong category at the expense of the intermediate and weak
categories of serological reactivity. The number of non-clade B patients with strong
to intermediate levels of cross reactivity to the MUC1/HIV-MN chimeras increased
from one to three as the number of HIV residues in the chimera increased from six
to nine (Figures 1A-C). Only two sera, one clade B with weak reactivity and one
non-clade B with intermediate reactivity, recognized the mucin negative control
(data not shown).
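
The strong, intermediate and weak categories are defined in the figure legends by absorbance ranges (2.5-3.0, 0.5-2.5 and 0.0-0.5 units after 10 minutes). The sketch below shows how a panel of readings could be binned and tallied under that scheme. Putting the boundary values 0.5 and 2.5 in the higher category is our assumption, since the legend does not say, and the absorbance readings in the example are hypothetical.

```python
from collections import Counter

CATEGORIES = ("strong", "intermediate", "weak")


def classify(absorbance):
    """Bin one ELISA reading using the ranges given in the figure legends."""
    if absorbance >= 2.5:      # boundary handling is an assumption
        return "strong"
    if absorbance >= 0.5:
        return "intermediate"
    return "weak"


def category_percentages(absorbances):
    """Return the percentage of sera falling in each category."""
    counts = Counter(classify(a) for a in absorbances)
    n = len(absorbances)
    return {cat: 100.0 * counts.get(cat, 0) / n for cat in CATEGORIES}


if __name__ == "__main__":
    hypothetical_readings = [2.9, 2.6, 1.4, 0.7, 0.2, 2.8, 0.1, 1.9]
    for cat, pct in category_percentages(hypothetical_readings).items():
        print(f"{cat}: {pct:.0f}%")
```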


ELISA reactivity of the sera described above is compared for the full HIV-MN
cyclic V3 loop (Fig. 2A), the 15 residue linear peptide (Fig. 2B), and the mucV3-9
120 residue multi-valent chimera (Fig. 1C). Comparing Figures 2A and 1C shows that a
multi-valent mucin-V3 loop chimera with only nine residues of the V3 loop
detected an equivalent level of antibody reactivity in patient serum (90% strong,
10% intermediate and 3 cross reactors) as the entire 36 residue cyclic V3 loop. The
mucin chimeras with nine residues of the HIV-V3 loop (Fig. 1C) detected 50% more
strong serological reactions among the clade B sera than a 15 residue HIV-MN linear
peptide (Fig. 2B). Comparing Figures 1A and 1B with 2B reveals that mucin
chimeras with 6 and 7 residues of HIV sequence detect similar or higher amounts of
HIV-PND serological reactivity than the 15 residue linear peptide.

Endpoint ELISA titers of the clade B HIV-1 sera against these antigens
reveal several additional features of both the sera and the antigens involved.
Figure 3A shows one example of an individual clade B serum titered with the cyclic
and linear V3 loops, the 15 residue linear peptide, and the three mucin-V3 chimeric
peptides. The cyclic and linear V3 loops displayed similar antibody reactivity to
HIV-1 serum, although the cyclic V3 loop consistently detected slightly higher levels
of antibody reactivity in all the titered sera (a total of 19 clade B sera). The pattern of
reactivity seen in Figure 3A for the 120 residue mucV3 chimeras mucV3-6, mucV3-7,
and mucV3-9 and the 15 residue linear peptide is also characteristic of the 19
titered sera. The chimera with nine HIV residues detects the same or slightly less
antibody reactivity than the full cyclic V3 loop at all dilutions. Although the
mucV3-7 chimera with 7 residues detects consistently less antibody reactivity than
mucV3-9, antibody could be detected at all dilutions. The mucV3-6 chimera
usually detected the least antibody reactivity of the chimeric proteins, but
was usually very close to the 15 residue linear peptide. One trend is consistently
observed: the mucV3-7 and mucV3-9 chimeras detect antibody reactivity at 10
to 50 fold higher dilutions than the 15 residue linear peptide or, at a given dilution,
give much higher absorbance values.

Multivalent antigens are potentially capable of detecting higher levels of
serum antibodies specific for the β-turn at the crest of the V3 loop than even the full
length cyclic V3 loop antigen. Some of the antibody reactivity detected by the full
length V3 loop peptides must arise from epitopes other than the β-turn at the crest,
which could not possibly be present in the chimeras. Figure 3B shows that, for an
individual with a V3 loop sequence at the position of the β-turn (IHMGWGRAFY)
with critical mutations relative to the MN sequence (I to M; P to W) that eliminate
all antibody binding to the chimeric proteins, there can be significant antibody
reactivity to other epitopes within the loop. It is conceivable that multivalent
presentation of the β-turn at the crest of the V3 loop in the correct conformation by
the chimera could offer quantitative advantages for the creation of diagnostic
peptides.

Advantages  of  Multi-Valency 

BIAcore SPR can be used to measure binding kinetics of monoclonal antibody
and HIV-1 chimeric antigen interactions and to examine the impact of multi-valency
in antigen-antibody binding. The mucV3-9 antigens were used in this
experiment since, of the mucin chimeras, these show the maximum reactivity in ELISA
with HIV infected patient serum. The association ($k_1$) and dissociation
($k_{-1}$) rate constants for the interaction of mAbs 447, 412, or 453 with the
60 and 120 residue peptides (3 knobs versus 6 knobs) are summarized in Table III.
The $k_1$ values shown in Table III for the monoclonal antibodies 447 and 412 binding
to mucV3-9 (60) and mucV3-9 (120) are similar. However, the dissociation rates for
the monoclonal antibodies 447 and 412 are 10.4 and 11.0 times slower for mucV3-9
(120) than for mucV3-9 (60). As a result these antibodies have a ten-fold stronger
affinity constant for the higher valency antigen. The effect is less pronounced but
similar for the monoclonal antibody 453 binding to the 60 and 120 residue peptides.
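
The link between the slower dissociation and the stronger affinity constant can be made explicit. The sketch below assumes the usual definition of the apparent affinity constant as the ratio of the association to the dissociation rate constant, and uses the mAb 447 values as we read them from Table 3. The variable and function names are ours.

```python
# mAb 447 rate constants as read from Table 3: (k1, k_1) per antigen.
MAB_447 = {
    "mucV3-9 (60)": (6.9e4, 4.6e-4),
    "mucV3-9 (120)": (5.9e4, 4.4e-5),
}


def affinity_constant(k_on, k_off):
    """Apparent affinity constant K_A = k1 / k_1 (assumed definition)."""
    return k_on / k_off


def fold_change(value_a, value_b):
    """Ratio of two quantities, value_a relative to value_b."""
    return value_a / value_b


if __name__ == "__main__":
    k_on_60, k_off_60 = MAB_447["mucV3-9 (60)"]
    k_on_120, k_off_120 = MAB_447["mucV3-9 (120)"]
    ka_60 = affinity_constant(k_on_60, k_off_60)
    ka_120 = affinity_constant(k_on_120, k_off_120)
    print(f"dissociation slower by {fold_change(k_off_60, k_off_120):.1f}-fold")
    print(f"K_A (60)  = {ka_60:.2e}")
    print(f"K_A (120) = {ka_120:.2e}")
    print(f"affinity stronger by {fold_change(ka_120, ka_60):.1f}-fold")
```

With these values the affinity gain comes out at roughly nine-fold, slightly below the dissociation-rate ratio, because the association rate for the 120 residue peptide is marginally lower.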


Immunogenicity 

Multivalent binding has the effect of changing the overall equilibrium
constant for the antibody-antigen interaction. This effect should also be operating
when the antigen engages an immunoglobulin molecule on the surface of a
lymphocyte during the induction of an immune response. The six knob, 120
residue peptide was expected to induce significant IgM antibodies through a T-cell
independent mechanism [22, 23]. Table IV summarizes the results of four 100 µg
immunizations of sets of 4 Balb/c mice with the mucV3-6 (120), mucV3-7 (120),
mucV3-9 (120), and mucin 105 residue proteins. MucV3-9 (120), which proved to be
the most effective chimeric antigen, is also the most immunogenic in Balb/c mice.
The mucV3-9 120 residue antigen induced high levels of IgG antibodies as well as
the expected IgM response. Using the ability to induce IgG and IgM antibodies as an
indicator, the chimeric antigens displayed the following decreasing immunogenicity
in Balb/c mice, as shown in Table IV: mucV3-9 (120) > mucV3-6 (120) > mucV3-7
(120). The 105 residue mucin control protein induced only IgM antibodies, which is
consistent with the in vivo humoral antibody responses to MUC1 reported in cancer
patients [22, 23].

We reasoned that if the high levels of IgG induced by the mucV3-9 120 residue
peptide were the result of the creation of a T-cell helper epitope in the chimeric
sequence, then the response to the 60 residue peptide should be very similar to that
seen for the 120 residue immunogen. This is because the 60 residue molecule
contains the same sequence and structural information as the 120 residue
molecule; the only difference is valency. The mucV3-9 (60) and (120) residue
peptides were each used to immunize 3 sets of 4 Balb/c mice. The mice were
immunized with 3 different antigen doses corresponding to 1, 10, and 20 nmoles of
antigen. The ELISA serological reactivity of the mouse sera to the immunizing
antigens is shown in Figure 4. Mice immunized with mucV3-9 (60), at all doses,
did not produce IgG or IgM antibody responses to mucV3-9 (60). In contrast, high
levels of IgG antibody are induced by the mucV3-9 (120) residue peptide at each dose.
We did not observe detectable IgG or IgM antibody responses to mucV3-7 (60) after
100 µg immunizations of Balb/c mice. It appears that the 60 residue antigens are
non-immunogenic due to either insufficient valency to induce T-independent responses
or lack of T-cell helper epitopes.

The specificity of the murine IgG response to the mucV3-9 (120) residue
peptide, from mice immunized with 10 nmoles of mucV3-9 (120), is shown in
Figure 5. The titer to the immunizing antigen is greater than 1/100,000 after only 4
months. At 4 months the titer to the cyclic V3 loop is 1/50,000 and the titer to the
linear V3 loop is less than 1/10,000. No substantial anti-mucin reactivity develops
during this time. These data strongly support the idea that these antigens will be
useful tools for inducing T-independent HIV specific antibodies.




Discussion 

We previously showed by high resolution NMR that substitution of the
immunodominant knob of human mucin MUC1 with the corresponding HIV-PND
results in structural preservation of the "correct" HIV epitope [16]. The goal of the
present study was to examine the antigenic, diagnostic and immunogenic
characteristics of these multi-valent mucV3 chimeric antigens. By substituting
increasing numbers of HIV-PND residues at the equivalent positions in the
repeating mucin knobs, mucin-like molecules with multiple HIV-PND epitopes in
their correct conformation were created. This approach offers several theoretical
advantages. First, presentation of the "native" conformation of short β-turn
segments can be maintained by the constraints of the MUC1 framework. Second,
any possibility for the PND epitope to be obscured by flanking hypervariable
sequences within the V3 loop is eliminated. Third, the antigenic and immunogenic
potential of the epitope is increased by presentation within the multi-valent MUC1.
Finally, mucin presentation of foreign β-turns at the tip of the knob assures that the
epitope in question is located on a surface that is accessible to antibody molecules in
solution or on the surface of lymphocytes. This approach can work as long as the
antigen in question forms a β-turn in the original protein [35, 36]. In this example
we used the PND of HIV-1 (MN); the residues GPGR have been shown by molecular
modeling, crystallography and NMR to form a type II β-turn structure [5, 7-10]. In
addition to the PND of HIV, the immunodominant segments of the
transmembrane proteins (TM) or the tips of other surface unit (SU) protein loops of
lentiviruses would make good candidates for incorporation into mucin chimeras
[37-41].

We showed that human monoclonal antibodies derived from HIV infected
patients bind more efficiently to the PND in the cyclic form [26]. Our ELISA results
suggest that antigens with seven to nine residues of the V3 loop, constrained in a
β-turn conformation and presented in the context of the multi-valent MUC1,
efficiently detect the same levels of V3 loop specific antibodies as full length V3 loop
peptides.

A possible mechanism for the enhanced ability of mucin/HIV-PND chimeras
to detect antibodies involves the process of antigen adsorption to a plastic plate.
When a multi-valent antigen is bound to a surface, there are additional sterically
available epitopes that can bind antibodies [42, 43]. An alternative mechanism is
suggested by considering the kinetic effects of a multivalent antibody interacting
with a multivalent antigen. The kinetic analysis allows the relative importance of
the association and dissociation rates in the antigen-antibody interaction to be
discerned. We observed a 10-fold decrease in the rate of dissociation when the same
antibody interacted with the 120 residue, six knob peptide as compared to the 60
residue, three knob peptide. When an antibody binds to the PND epitope on one of
the knobs of the 120 residue peptide, there are 5 additional knobs available for
binding the remaining arm of the antibody. Since the minimal distance between
two antibody combining sites is 35 Å [44] and the maximal distance between two
knobs is 32 Å [16], one antibody may bind to two knobs only in the 120 residue (6
knob) peptide. In the 60 residue peptide, the N-terminal knob has a skewed
orientation due to the lack of the adjacent sequence, which may interfere with
multivalent binding. Differences in multi-valent binding may change the overall
equilibrium constant for the antibody-antigen interaction.

Another intriguing idea is the use of multivalent mucin/HIV-PND chimeras
as immunogens, either as a component of a multi-subunit vaccine, or possibly as an
immunotherapeutic for HIV infected patients. Berberian et al. recently found that
the immunoglobulin VH3 gene family, found in a subset of IgM+ B cells, contains a
conserved idiotope for HIV-gp120 [45]. The authors show that gp120 preferentially
binds to EBV-immortalized B cells and immortalized tonsil mantle zone B cells
that express the VH3 idiotope. The binding of gp120 could activate this
sub-population of B cells and substitute for the normal Ig ligand. Fractions of IgM from
normal individuals bound gp120 (Kd 8.6 × 10^? M), and monoclonal IgM antibodies
with non-HIV specificity were found which neutralize the virus. In addition, the
authors report a correlation between anti-gp120 IgM and the clinical stage of HIV
infection [45]. Our results indicate that the 120 residue peptides with 6 knobs are
immunogenic and can induce both high titer IgM and IgG antibodies specific for
only HIV-1. Unfortunately, the antisera produced to date have not been
neutralizing. We are preparing multivalent antigens with added T-cell epitopes in
order to assess the effect of T-dependent versus T-independent induction of
antibody on the neutralizing ability of the sera.

Figure Legends

Figure 1. Serological reactivity of clade B and non-clade B HIV infected patients to
120 residue mucV3 chimeras: (A) mucV3-6, which contains six (IGPGRA) residues,
(B) mucV3-7, which contains seven (HIGPGRA) residues, and (C) mucV3-9, which
contains nine (IHIGPGRAF) residues of the HIV-V3 loop. Strong, intermediate, and weak
serological reactions correspond to the development of 2.5-3.0, 0.5-2.5, or 0.0-0.5
Absorbance units after 10 minutes.

Figure 2. Serological reactivity of clade B and non-clade B HIV infected patients to
peptide antigens: (A) the full 36 residue HIV-MN V3 loop, (B) a 15 residue linear
peptide (DKRIHIGPGRAFYTT) of the HIV-MN V3 loop. Strong, intermediate, and
weak serological reactions correspond to the development of 2.5-3.0, 0.5-2.5, or 0.0-0.5
Absorbance units after 10 minutes.

Figure 3. Endpoint ELISA titers of HIV-1 sera against the cyclic and linear V3 loops,
the 15 residue linear peptide, and the 120 residue six knob chimeras mucV3-6, mucV3-7,
and mucV3-9. (A) Serum from a patient with a clade B PND sequence of
IHIGPGRAF. (B) Serum from a patient with a non-clade B PND sequence of
IHMGWGRAFY.

Figure 4. (A) Endpoint ELISA titers of mice immunized with 2, 10, and 20 nmoles of
either the six knob 120 residue antigen, mucV3-9 (120), or the three knob 60
residue antigen, mucV3-9 (60). (B) Cross reactivity of sera from mice immunized
with 10 nmoles of mucV3-9 (120) with the cyclic and linear HIV (MN) V3 loops. All
points are the average of 4 mice.

Table 1

Mucin V3 Loop Chimeras

Chimera          Sequence of 1 Tandem Repeat     No. Residues
                 (positions 1-20)
mucin            (VTSAPDTRPAPGSTAPPAHG)5.25      105
mucV3-6 (60)     -GPGRAF-                        60
mucV3-6 (120)    -GPGRAF-                        120
mucV3-7 (60)     -HIGPGRA-                       60
mucV3-7 (120)    -HIGPGRA-                       120
mucV3-9 (60)     IHIGPGRAF-                      60
mucV3-9 (120)    IHIGPGRAF-                      120

Table 2

Monoclonal Antibody Reactivity Preference for Cyclic V3 Loop

No.  mAb     Epitope      V3 cyclic   V3 linear   % difference
1    447     HIGPGRAF     4.0         3.2         25
2    412                  2.4         2.1         14
3    453     KRIHIGPGR    3.1         2.8         11
4    386     RIHIGPGR     3.6         3.2         13
5    9205                 4.7         4.0         17
6    268     RIHIGPGR     4.7         4.0         18
7    782                  4.8         4.1         17
8    50.1    KRRIHIGPG    4.5         3.6         25
9    257     KRKRIHIGP    5.2         4.4         18
10   391                  4.0         3.1         29
11   838                  2.8         2.2         27
12   419                  2.0         1.6         25
13   181                  2.9         2.4         21
14   908                  5.6         4.6         22
15   537                  0.9         0.8         13

Negative Controls
16   9284                 0.01        0.01
17   NHS                  0.03        0.02

Reactivity = Response Units (RU) for antibody-antigen interaction, divided by the
baseline increase in Response Units after antigen loading (a measure of the amount of
antigen on the matrix).


Table 3

Kinetic Constants for mAbs Binding to MucV3 Chimeras

mAb   Epitope      Antigen        k1 (M^-1 sec^-1)   k-1 (sec^-1)
                                  (assoc)            (dissoc)
447   HIGPGRAF     MucV3-9-60     6.9 × 10^4         4.6 × 10^-4
                   MucV3-9-120    5.9 × 10^4         4.4 × 10^-5
412                MucV3-9-60     2.8 × 10^4         2.2 × 10^?
                   MucV3-9-120    2.9 × 10^4         2.0 × 10^?
453   KRIHIGPGR    MucV3-9-60     1.1 × 10^5         1.2 × 10^-3
                   MucV3-9-120    6.8 × 10^4         2.3 × 10^-4


Table  4 

Average Titers of Mice Immunized with MucV3 Chimeras

MucV3 Chimeras


IgG  Titer 


IgM  Titer 


1.1-120 

1.2-120 

1.3-120 


5  X  10^ 
5  X  102 
3  X  10® 


4  X  102 

5  X  lOl 
1  X  10^ 
3  X  103


Introduction

Studies on the feasibility of a subunit vaccine to protect against human immunodeficiency virus
(HIV) infection have principally focused on the third variable (V3) loop of the envelope surface protein.
One of the neutralizing determinants of HIV-1 is located inside the V3 loop. However, progress toward
a vaccine based on neutralizing determinants has been impeded by the amino acid sequence variability
in the V3 loop of different HIV isolates. The elusive nature of the V3 loop structure prompted us to
carry out a systematic study on different isolates in an attempt to identify a common structural motif in
the V3 loop regardless of the amino acid sequence variability. We have performed 2D NMR structural
studies on three different V3 loop peptides: MN, Haiti, and RF (Catasti et al., 1995 & 1996). The three
V3 loops were all 35 residues long and S-S bridged at the termini. The NMR studies were carried
out first in water, then in a 70%/30% mixture of water/trifluoroethanol (TFE). TFE is a solvent wide

Figure 1 shows that similar secondary structures are observed for the three different V3 loops: a
GPG(K/R) crest in the center of the neutralizing determinant, two extended regions flanking the central
crest, and a helical region in the C-terminal domain observed only in the water/TFE mixture. The RF
V3 peptide did not dissolve in the water/TFE mixture; therefore we could run the experiments only in
aqueous solution. Structural prediction studies revealed that the variability in sequence and structure of
the V3 loop is confined to the N- and C-terminal sides of the conserved GPG crest. Figure 2 is a summary
of the NMR secondary structural assignments (Catasti et al., 1995 & 1996) and the results of several
secondary structure prediction algorithms. With the exception of the PSA method, most of the algorithms
fail to identify the alpha helix in the C-terminal portion of the V3 loops.

Figure 1. Ribbon diagram showing the average folding patterns of the structures of the MN, Haiti and RF V3 loops
in water and in a mixture of 70%/30% water/TFE. In each case the average is taken over 70 sampled low energy
structures. Note that, in each case, the neutralizing epitope containing the central GPG(R/K) sequence forms a
protruding loop even though the local structure and presentation of the loop in the different cases are noticeably
different. Structures that satisfy the NMR constraints of the V3 loops in water show a higher degree of flexibility
than those in agreement with the NMR data in the mixed water/TFE solvent. This is due to the formation of the
alpha helix in the mixed solvent. Color code is as follows: GPG(R/K) crest is red, extended regions are green,
disulfide bridges are yellow and the alpha helical region is cyan.

Figure 2. Comparison of secondary structure assignments of the
NMR determined structures and secondary structure prediction
for the three V3 loops, MN, Haiti and RF. The different prediction
algorithms are indicated on the left. Some of these methods are
discussed by Myers and Farmer in Part III of this compendium.

ABSTRACT Molecular modeling and two-dimensional
NMR techniques enable us to identify structural features in
the third variable region (V3) loop of the human immunodeficiency
virus (HIV) surface glycoprotein gp120, in particular
the principal neutralizing determinant (PND), that remain
conserved despite the sequence variation. The conserved
structure of the PND is a solvent-accessible protruding motif
or a knob, structurally isomorphous with the immunodominant
knobs in the tandem repeat protein of human mucin 1
(MUC1) (a tumor antigen for breast, pancreatic, and ovarian
cancer). We have replaced the mucin antigenic knobs by the
PND knobs of the HIV MN isolate in a set of chimeric human
MUC1/HIV V3 antigens. This produced multivalent HIV
antigens in which PNDs are located at regular intervals and
separated by extended mucin spacers. In this article we show
by two-dimensional NMR spectroscopy that the multivalent
antigens preserve the PNDs in their native structure. We also
demonstrate by ELISA that the antigens correctly present the
PNDs for binding to monoclonal antibodies or polyclonal
antisera from HIV-infected patients.


Fusion of a viral surface with the host-cell surface is the first
step in the life cycle of the human immunodeficiency virus
(HIV). The process of HIV fusion into a host cell is determined
by two discrete functional sites on the 120-kDa viral
surface unit glycoprotein, gp120. First, a segment near the C
terminus of gp120 directly binds to the CD4 molecule on the
surface of a host cell (1, 2). Next, the third variable region (V3)
loop of gp120 mediates virus fusion (3, 4). Virus fusion is
abrogated when these binding events are prevented by blocking
either one or both of these sites of gp120. The blocking of
these sites is the primary mechanism of virus neutralization by
antibodies (5-7). Two approaches have been used for viral
neutralization: (i) vaccination aimed at generating antibodies
specific for the principal neutralizing determinant (PND),
which is located inside the V3 loop of gp120 and contains a
fairly conserved Gly-Pro-Gly-(Arg or Gln) (GPGR/Q) crest at
the center of the PND (6, 8); and (ii) the administration of
soluble CD4 receptors (9). However, these approaches currently
suffer from serious drawbacks. For example, because of
sequence variability in the V3 loop (8, 10), neutralizing
antibodies elicited by the V3 loop from one HIV isolate do not
neutralize other HIV isolates (11, 12). Further, the short
half-life of soluble CD4 molecules has limited their applicability for
the treatment of a "slow" virus infection.

The V3 loop of HIV surface gp120 is a potential target for
protective immunity. This small 34- to 36-residue domain
contains the well-characterized PND as well as a cross-reactive
cytotoxic T-lymphocyte epitope (13). The V3 loop also participates
in vital functional properties of HIV, such as cell tropism
(14, 15) and cell fusion (3, 4, 16). Antibody binding to the V3
loop can interfere with these processes in the life cycle of the
virus. To exploit the V3 loop as an antibody target, one has to
understand (i) the effect of sequence variability on the structure
and antigenicity and (ii) the structural features of the V3
loop (especially at the PND) that remain invariant. Our
previous work using theoretical studies of 30 different V3 loops
(17, 18) and three-dimensional (3-D) structure determination
of two divergent cyclic V3 loops (19, 20) allows us to define the
effect of sequence variability on the global structure of the
entire V3 loop and the local structure of the PND centered
around the conserved GPGR crest. Irrespective of the variability
in the amino acid sequences on either side of the type
II GPGR turn, the PNDs of different V3 loops adopt a
protruding solvent-accessible motif (19-22) or knob. This
work focuses on synthetic polypeptide antigens that contain
PNDs that preserve and present the same conserved structural
features as in the "native" V3 loop. Such a design requires a
polypeptide construct that also contains a protruding motif
like the PNDs in the V3 loop. The tandem repeat protein,
human mucin 1 (MUC1) (a tumor antigen for breast, pancreatic,
and ovarian cancer), provides us with such a structural
motif (23-25). We construct a set of chimeric human MUC1-HIV
V3 loop proteins in which HIV PNDs replace the
immunodominant knobs of MUC1 tandem repeats. We show
by two-dimensional (2-D) NMR that the PNDs in the chimeric
proteins preserve the same structure as in the native V3 loops.
Finally, we show that these antigens are also able to bind
polyclonal antisera from HIV-infected patients and type-specific
monoclonal antibodies (mAb).

MATERIALS  AND  METHODS 

The 60- and 120-residue peptide amides were synthesized, and
the products of the synthesis were deprotected, cleaved from
the resin support, purified by HPLC, and subjected to molecular
weight determination by electrospray mass spectroscopy
as described (26). The names and sequences in single-letter
amino acid code of one copy of the tandem repeat antigens
described here are as follows: human MUC1, VTSAPDTRPAPGSTAPPAHG;
MUC1-V3 1.1, VTSGPGRAFAPGSTAPPAHG;
MUC1-V3 1.2, HIGPGRAPAPGSTAPPAHGV;


Abbreviations: HIV, human immunodeficiency virus; PND, principal
neutralizing determinant; NOE, nuclear Overhauser effect; NOESY,
NOE spectroscopy; gp120, 120-kDa glycoprotein; V3, variable region
third loop of HIV surface gp120; mAb, monoclonal antibody; MUC1,
human mucin 1; GPGR/Q, Gly-Pro-Gly-(Arg or Gln); 2-D and 3-D,
two- and three-dimensional; HIGPGRA, His-Ile-Gly-Pro-Gly-Arg-Ala.
and MUC1-V3 1.3, IHIGPGRAFAPGSTAPPAHG. In each
case, peptides with three and six complete tandem repeats
were successfully synthesized. The cyclic and linear V3 peptides
from the HIV isolate MN (HIV-MN) have the single-letter-code
sequence CTRPNYNKRKRIHIGPGRAFYTTKNIIGTIRQAHC,
and the small 15-residue linear peptide has
the sequence DKRIHIGPGRAFYTT. The enzyme-linked
immunosorbent assay (ELISA) was performed as described
(27).
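
As a quick consistency check on the sequences listed above, the following minimal Python sketch assembles the three-repeat and six-repeat peptides from one copy of each tandem repeat and confirms their lengths. The dictionary `REPEATS` and the helper `tandem_peptide` are illustrative names introduced here, not part of the original work, and the sketch ignores the C-terminal amide and any synthesis details.

```python
# One copy of each 20-residue tandem repeat, as listed in Materials and Methods.
REPEATS = {
    "MUC1": "VTSAPDTRPAPGSTAPPAHG",
    "MUC1-V3 1.1": "VTSGPGRAFAPGSTAPPAHG",
    "MUC1-V3 1.2": "HIGPGRAPAPGSTAPPAHGV",
    "MUC1-V3 1.3": "IHIGPGRAFAPGSTAPPAHG",
}


def tandem_peptide(repeat: str, copies: int) -> str:
    """Concatenate `copies` head-to-tail copies of one repeat unit (hypothetical helper)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return repeat * copies


if __name__ == "__main__":
    for name, unit in REPEATS.items():
        short = tandem_peptide(unit, 3)   # 60-residue antigen
        long_ = tandem_peptide(unit, 6)   # 120-residue antigen
        # Count GPGR crests; native MUC1 has none, each chimera has one per repeat.
        print(name, len(unit), len(short), len(long_), short.count("GPGR"))
```

With these sequences, the MUC1-V3 1.2 60-residue peptide ends in valine, which matches the Val-60 mentioned in Results.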

All NMR experiments on the MUC1-V3 60-residue antigen
were carried out on the 600-MHz Bruker spectrometer at the
University of Alabama at Birmingham. All NMR spectra were
collected at 10°C with peptide concentration at 5 mM (pH 5.5).
All 2-D data were acquired in the phase-sensitive mode with
the presaturation of the ¹H²HO signal during the relaxation
delay (28, 29).

As previously described (18, 19), a set of distance constraints
was derived by analyzing the NMR data with the aid of
full-matrix nuclear Overhauser effect (NOE) spectroscopy
(NOESY) simulations, the associated R-factor test, and energy
calculations. Analyses of the 2-D NMR data of the MUC1-V3
60-residue peptide produced 220 interproton distance constraints.
The energy term EDIST (14, 38) was added to the
force field as described by Scheraga and coworkers (30).
Monte Carlo simulated annealing (31) was performed to
obtain a set of structures consistent with the NMR data. The
maximum step size of the torsion angles was set at 15 degrees,
which produced an acceptance ratio of 0.20-0.50 for the
50,000-step Monte Carlo cycle at each temperature. Full-matrix
NOESY calculations were repeated for the final 150
low-energy MUC1-V3 structures. These structures were analyzed
in terms of their energies, end-to-end lengths (Re),
relative orientations of the knobs, and torsion angle parameters.
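
The sampling scheme described above (torsion-angle moves of at most 15 degrees, a Metropolis test, and an acceptance ratio recorded at each temperature) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' program: `restraint_energy` is a hypothetical harmonic penalty standing in for the force field plus the EDIST term, and the targets, temperatures, and step counts are arbitrary illustrative values.

```python
import math
import random


def restraint_energy(angles, targets, width=30.0):
    """Toy stand-in for a restraint term: harmonic penalty on each torsion angle."""
    total = 0.0
    for angle, target in zip(angles, targets):
        # Wrap the difference into [-180, 180) so that -179 and 179 count as close.
        diff = (angle - target + 180.0) % 360.0 - 180.0
        total += (diff / width) ** 2
    return total


def anneal(targets, temperatures, steps_per_temp, max_step=15.0, seed=0):
    """Metropolis Monte Carlo over torsion angles with a decreasing temperature schedule."""
    rng = random.Random(seed)
    angles = [rng.uniform(-180.0, 180.0) for _ in targets]
    energy = restraint_energy(angles, targets)
    ratios = []
    for temp in temperatures:
        accepted = 0
        for _ in range(steps_per_temp):
            i = rng.randrange(len(angles))
            old = angles[i]
            # Perturb one torsion by at most max_step degrees and wrap it.
            trial = old + rng.uniform(-max_step, max_step)
            angles[i] = (trial + 180.0) % 360.0 - 180.0
            new_energy = restraint_energy(angles, targets)
            delta = new_energy - energy
            # Metropolis criterion: always accept downhill, sometimes accept uphill.
            if delta <= 0.0 or rng.random() < math.exp(-delta / temp):
                energy = new_energy
                accepted += 1
            else:
                angles[i] = old
        ratios.append(accepted / steps_per_temp)
    return angles, energy, ratios


if __name__ == "__main__":
    final_angles, final_energy, ratios = anneal(
        targets=[-60.0, 120.0, 80.0, 0.0],
        temperatures=[2.0, 1.0, 0.5, 0.1],
        steps_per_temp=2000,
    )
    print(final_energy, ratios)
```

In a sketch of this kind the maximum step size is the natural knob for steering the acceptance ratio: larger steps are rejected more often, smaller steps are accepted more often but explore more slowly.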

RESULTS 

Principles of Design. We previously determined NMR
structures of two disulfide-bridged V3 loops: a Thailand HIV
isolate (19) and the HIV-MN isolate (38). Amino acid sequences
of the PNDs of these two V3 loops are quite different.
However, as shown by superimposing two PNDs (Fig. 1 Left),
both of them form similar protruding motifs. In the case of the
HIV Thailand isolate, the central GPGQ forms a type II turn
and a solvent-accessible tip; similarly in the case of HIV-MN,
the central GPGR forms the type II turn and the accessible tip.
In spite of the fact that the amino acid sequences flanking the
central type II turn on the N- and C-terminal sides are
different, the polypeptide backbone and side-chain orientations
in the flanking regions are very similar in the two cases
(Fig. 1 Left). However, when the PND is presented in the
context of the V3 loop or gp120, the variability of regions
flanking the central PND residues masks the conformational
purity of the PND.

These protruding motifs (or knobs) are predicted from
molecular modeling studies (17, 18) of a large set of different
V3 loops. The "knob-like" structure is also present in the
tandem repeat domain of a protein totally unrelated to HIV
V3 loops. Fig. 1 Right illustrates the surprising result found
upon solving the structure of three tandem repeats of the
human breast and pancreatic tumor antigen MUC1,
(PDTRPAPGSTAPPAHGVTSA)3. In this system, we again found a
knob-like motif that is crested by a type II turn (isomorphous
with the PND of HIV), which is immunodominant for humoral
immune responses (32, 33). As shown by detailed NMR
analyses (25), the elongated mucin structure contains knobs
that project away from the long axis of the molecule and are
connected by extended spacers (Fig. 1 Right). In the mucin
structure, Ala-Pro-Asp-Thr from positions 0-3 in the above
repeat (where Ala at position 0 is the amino acid from the
previous repeat) forms the type II turn occupying the
solvent-accessible tip of the knob; this tip is the immunodominant
antigenic site in MUC1. In our antigen-engineering approach,
we chemically synthesize a series of MUC1-V3 chimeric
polypeptides in which mucin immunodominant knobs are
replaced by the HIV-MN PNDs (Fig. 1 Right). The ability of
these chimeric proteins to act as HIV antigens requires
fulfillment of two criteria: (i) the PNDs in MUC1-V3 antigens


Fig. 1. (Left) Superimposition of the protruding motifs of two NMR structures: the V3 loop from the HIV-MN isolate (designated MN) and
that from the Thailand TN243 isolate (named TN). The sequences of two motifs in single-letter amino acid code are: MN, RIHIGPGRAFYT;
and TN, SITIGPGQVFYR. Note that the GPGR or GPGQ crests are oriented in the same way. (Right) The principle of design. The HIV PND
sequences above the MUC1 sequences actually replace the MUC1 residues in the chimeras.
must be structurally equivalent with the native V3 loop, and
(ii) the surface accessibility to antibodies of the PNDs in
MUC1-V3 antigens must be as good as or better than in the
native V3 loop. Below, we discuss how these criteria are
fulfilled for the MUC1-V3 antigens that we have designed
so far.

Preservation of Structure. We have chemically synthesized
60- and 120-residue peptides of three different MUC1-V3
antigens in which three different lengths of HIV-MN PND
sequences are inserted. The lengths of the PND inserts are 6,
7, and 9 amino acids. The names and sequences of the chimeric
antigens are listed in Materials and Methods. The start of the
sequence is chosen in such a way that each 20-amino acid
repeat has one knob (Fig. 1 Right). Therefore, by varying the
number of repeats, we can vary the number of knobs and study
the related effect on the structure and dynamics of the antigen
and on the associated antibody-binding affinity. The binding
data indicate that we need at least 7 PND amino acids for
optimum binding.

The NOESY (400 ms of mixing) fingerprint region (containing
cross-peaks due to coupling between backbone HN and
Hα) of the MUC1-V3 1.2 60-residue peptide reveals that the
20 residues in the sequence repeat also form the structural
repeat; i.e., the protons from the 20 amino acids in all three
repeats sample the same chemical-shift environment and
structure (Val-60 being the only exception). Comparison of the
NOESY cross-section from the MUC1-V3 1.2 60-residue
peptide with that of the MUC1-V3 1.2 120-residue peptide
suggests that local structures of different 20-amino acid repeats
in the same antigen (60 or 120 residue) are remarkably
similar. However, the extent of flexibility and the nature of
dynamics can be different in these two antigens. The diagnostic
NOE pattern of importance is the one involving the HIV PND
epitopes, His-Ile-Gly-Pro-Gly-Arg-Ala (HIGPGRA), of
MUC1-V3 1.2. The Hα(Pi)-HN(Gi+1), Hα(Pi)-HN(Ri+2), and
HN(Gi)-HN(Ri+1) NOEs are indicative of a type II β-turn of
GPGR centered around the proline and glycine residues (19,
34, 35). The sequential NOE pattern involving histidine,
isoleucine, and alanine residues of HIGPGRA defines the
backbone and side-chain orientations of the flanking residues
in the PND with respect to the central GPGR type II turn. The
NOESY (200 ms of mixing) fingerprint (HN-Hα) region of
MUC1-V3 1.2 120-residue peptide has features similar to
those in the MUC1-V3 1.2 60-residue peptide except that the
weak Hα(Pi)-HN(Ri+2) NOE is not seen.

The  analyses  of  the  NMR  data  of  MUC1-V3  1.2  60-residue 
peptide  revealed  intraresidue  NOEs  involving  H"-H^,  H^- 
H^,  H^-H^,  H«-H/5,  H^-Ht',  etc,,  and  interresidue  NOEs 
involving  H“(residue  /)-H^(residue  /+1),  H^(z)-H%  +  1), 
HN(z>l)-HN(i+l),  and  some  HX/)-HN(/-hl).  Full-matrix 
analyses and the associated R-factor test with respect to the
NOESY data at 200 and 400 ms of mixing produced 220
distance constraints for the MUC1-V3 1.2 60-residue peptide
(36). Simulated annealing subject to the distance constraints
resulted in a set of stable (low energy) structures in agreement
with the NMR data. All of the structures shared the following
common features: (i) a tight GPGR type II turn; (ii) a
protruding motif (or knob, colored red) consisting of residues
1-11 in each repeat, with the GPGR forming the
solvent-accessible tip; (iii) seven residues from the HIV-MN PND in
MUC1-V3 that are structurally isomorphous with the same
residues in the cyclic HIV-MN V3 (Fig. 2); and (iv) an
extended spacer (colored green), consisting of β-strand and
polyproline structures, connecting the knobs. We noticed
significant conformational flexibility in the MUC1-V3 1.2
60-residue peptide in terms of the end-to-end length (Re) and
in the relative distance and orientations of the knobs. This
flexibility neither violates the NMR distance constraints nor
disrupts structural features i-iv discussed above. Fig. 3 shows
three different folded forms of the MUC1-V3 1.2 60-residue


Fig. 2. Structural similarity between the HIV PND epitope HIGPGRA
in the HIV-MN cyclic V3 (green) and in the central knob of
chimeric MUC1-V3 1.2 60-residue peptide. The residues on the N
terminus are on the left, while the residues on the C terminus are on
the right. Note that P occupies the left corner of the crest.

peptide in agreement with the NMR data. Fig. 3 Left shows a
structure with Re = 80 Å and with knobs alternating on two
sides of the long axis of the molecule, whereas Fig. 3 Center
shows a structure with Re = 80 Å but with the knobs located
on the same side of the long axis. A more compactly folded
structure with Re = 30 Å and with the first and the third knobs
close together is shown in Fig. 3 Right. The conformational
variants in Fig. 3 involve only minor variations in backbone
torsion angles (i.e., small correlated changes produce large
cumulative deviations). Also note that some degrees of flexibility
possible for the 60-residue peptide may be sterically
prohibited for the 120-residue peptide. For example, the
progressive extension of the 60-residue chain at the bottom in
Fig. 3 to accommodate the 120-residue peptide may be sterically
excluded because of the clash of the third and fourth
knobs. Therefore, for steric reasons the 120-residue chain must
necessarily unfurl. This implies, even though the local
structures of the 20-amino acid repeats may be identical in the
Fig. 3. Flexibility in the arrangement of the tandem repeats in
MUC1-V3 1.2 60-residue peptide.
60- and 120-residue peptides, that the global folding and
dynamics can be different in these two polypeptides. These
differences, in addition to the difference in the number of
binding sites, are likely to alter antibody binding properties for
these two polypeptides. The results in Figs. 2 and 3 show that
the PND inside the chimeric construct is structurally similar to
that in the native V3 loop.

Presentation of Structure. Table 1 lists the ELISA reactivity
of various V3 antigens against 39 different polyclonal antisera
from HIV-infected patients. Also shown in the table are the
PND sequences at the tip of the predominant HIV V3 loops
from the corresponding HIV-infected patients. The reactivity
of the mucin 105-mer is also shown on a relative scale of 0 to
4 (0 being least reactive). Note that 37 of 39 antisera in Table
1 show no reactivity with the mucin 105-mer; the remaining
two show only weak binding. However, MUC1-V3 120-residue
antigens show reactivity comparable to the full-length linear
and cyclic V3 loops in the majority of the cases. MUC1-V3 1.3,
which has the longest PND segment, is the most reactive
among all three MUC1-V3 antigens. The conformational
purity of the PND epitope in MUC1-V3 1.3 (and not the
length of the sequence) decides the high reactivity. By comparison,
the 14-residue linear V3 peptide (with 5 more residues
than the PND in MUC1-V3 1.3) is much less reactive. This is
because the small linear V3 peptide is less structured than the
PND in MUC1-V3 1.3. All antisera specific for
Ile-Gly-Pro-Gly-Arg-Ala-(±Phe) [IGPGRA(F); i.e., with or without
phenylalanine in the epitope] show strong reactivity toward
MUC1-V3 1.2. These reactivities compare well with those
displayed by the HIV-MN cyclic V3 loop against the same
antisera.


Table 1. Reactivity of patient polyclonal antisera and mAbs with V3 antigens

                          Reactivity with V3 antigens
Antibody                  HIV-MN V3 loops         MUC1-V3 chimeric        Mucin
                                          Small   120-residue             105-mer
Name        PND sequence  Cyclic  Linear  linear  1.1     1.2     1.3

HIV patient serum
HOND01      IPIGPGRAF     +4      +4      +4      +4      +2      +4      +1
HOND02      IHIGPGRAF     +4      +4      +4      +4      +4      +4      0
HOND03      INIGPGRAW     +4      +4      +2      +1      +3      +4      0
HOND04      IHMGPGGAF     +4      +4      0       0       +2      +4      0
HOND05                    +4      +4      +4      0       +4      +4      0
HOND06      IYIGPGRAF     +4      +4      +4      +4      +4      +4      0
HOND08      IHIGPGSAW     +4      +4      +2      +1      +4      +4      0
HOND09      IHIGPGRAF     +4      +4      +1      +1      0       +4      0
HOND10      IPIGPGRAF     +4      +4      +4      +4      +4      +4      0
HOND12      IPLGPGRAF     +4      +4      +2      +3      +2      +4      0
HOND13                    +4      +4      +4      +3      +4      +4      0
HOND14      IHIGPGRAF     +4      +4      +4      +1      +4      +4      0
HOND15      IHIGPGRAF     +4      +4      +3      +2      +4      +4      0
HOND16      IHIGPGSAW     +4      +4      +4      0       0       +4      0
HOND17      IHIGPGRAF     +4      +4      +4      +1      0       +4      0
HOND18      VTIGPGKVW     +3      +4      +1      0       +3      +4      0
HOND19      INIGPGRAF     +4      +4      +3      +4      +4      +4      0
HOND20                    +4      +4      +1      +4      +4      +4      0
HOND51      VHIGPGRAF     +4      +4      +4      +4      +4      +4      0
HOND52      IYIGPGRAF     +4      +4      +4      +4      +4      +4      0
HOND53      VRIGPGRAF     +4      +4      +1      0       0       +1      0
HOND54                    +4      +4      +1      0       +4      +4      0
HOND55      INIGPGRAF     +4      +4      +3      0       0       +4      0
HOND58      INIGPGRAF     +4      +4      +4      +4      +4      +4      0
HOND59      VHIGPGRAF     +3      +4      +1      0       0       +2      0
HOND60      INIGPGRAF     +4      +4      +4      +2      +4      +4      0
HOND61                    +4      +4      +2      0       +4      +4      0
HOND62                    +4      +4      +2      +1      +1      +4      0
RW9223      IHIGPGRAF     +4      +4      +2      +4      +4      +4      0
RW9225      VRIGPGQTF     0       0       0       0       0       +3      0
BR9203      IHMGWGRAF     0       0       0       0       0       0       0
BR9221      IHMGWGRAF     +2      0       0       0       0       0       0
BR9225      IRIGPGQAF     +4      +4      0       0       +1      +3      +2
BR9228      IHMGWGRTF     0       0       0       0       0       0       0
TH9201      INIGPGQVF     0       0       0       0       0       0       0
TH9206      ITIGPGQVF     0       0       0       0       0       0       0
TH9209      ITIGPGQVF     0       0       0       0       0       0       0
UG9238      TPIGQGQVL     0       0       0       0       0       0       0
UG9365      TSIGLGQAL     0       0       0       0       0       +4      0

mAb
268-D       HIGPGR        +2      +2      -       0       +2      +2      0
R/V3-50.1   RIHIG         +4      +2      -       0       0       +4      0

Reactivity: 0, absorbance (A) = 1-5× the average of six normal controls; +1, A = 5× the average of six normal controls up
to 1.0 A405 unit in ELISA; +2, A = 1-1.5 A405 units in ELISA; +3, A = 1.5-2.0 A405 units in ELISA; +4, A = >2.0 A405 units
in ELISA.
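
The scoring scale in the footnote to Table 1 can be written as a small function. This is a minimal Python sketch, not the authors' procedure: the function name `elisa_score` and its arguments are hypothetical, and the handling of values that fall exactly on a boundary (for example A405 = 1.0) is an assumption, since the footnote does not say which bin such values belong to.

```python
def elisa_score(a405: float, control_mean: float) -> int:
    """Map an A405 reading to the 0 to +4 scale of Table 1 (hypothetical helper).

    control_mean is the average absorbance of the six normal control sera.
    Boundary values are assigned to the lower bin, which is an assumption.
    """
    if a405 > 2.0:
        return 4
    if a405 > 1.5:
        return 3
    if a405 > 1.0:
        return 2
    if a405 > 5.0 * control_mean:
        return 1
    return 0


if __name__ == "__main__":
    # Illustrative readings only; these are not data from the study.
    for reading in (0.2, 0.6, 1.2, 1.7, 2.3):
        print(reading, elisa_score(reading, control_mean=0.05))
```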
Two neutralizing mAbs show distinct antigen specificity
(Table 1). The human mAb 268-D (12) is specific for the
sequence HIGPGR and reacts with MUC1-V3 1.2 and
MUC1-V3 1.3 antigens, which contain this sequence, but not
with MUC1-V3 1.1, which does not contain this full sequence.
The mouse mAb R/V3-50.1, obtained by immunization with a
cyclic V3 loop (37), is specific for the sequence
Arg-Ile-His-Ile-Gly. This mAb only reacts with MUC1-V3 1.3.

DISCUSSION 

MUC1-V3 antigens are unique in the following ways. (i) NMR
and antibody-binding data verify that they reproduce the
native structure of the PNDs even when they are presented in
the context of a totally unrelated protein like MUC1. (ii)
Immunogens containing identical PNDs within the MUC1
chimeras effectively allow enhanced presentation of a conserved
structural feature of the virus in a fashion not possible
with nonchimeric HIV antigens. The true advantage of this
approach will be to induce either T-cell-dependent or
T-cell-independent antibody responses to the PND, depending on the
precise construction of the antigen. (iii) Multiple PNDs,
present in these chimeric proteins, may be advantageous in
enhancing the immune response by significantly increasing the
affinity of antibody binding. The importance of multiple PNDs
being present in the same antigen becomes clear by analyzing
the relative binding of MUC1-V3 1.2 60- and 120-residue
peptides to different antisera. The data show that the
120-residue peptide is a better ligand than the 60-residue peptide
for the majority of the antisera we tested (data not shown).
This is probably because the more numerous PND knobs in
the 120-residue peptides are correctly disposed along the long
axis of the molecule to facilitate the binding of bivalent
antibodies. (iv) Alternatively, the nature of the MUC1-V3
structure (Fig. 3) suggests that if two or more different PNDs
are grafted alternately along the chain, there is enough flexibility
in the spacers so that two or more antibodies specific for
two different PNDs can both bind bivalently, interdigitating
along the molecule. Finally, there is no reason why more than
two PNDs cannot be introduced in the molecule. This may be
critical in designing vaccines for a highly mutating pathogen
like HIV.
The third variable (V3) loop of HIV-1 surface glycoprotein,
gp120, has been the target of neutralizing antibodies.
However, sequence variation inside the V3
loop diminishes its effectiveness as a potential vaccine
against HIV-1. The elusive nature of the V3 loop structure
prompted us to carry out a systematic study on
different isolates in an attempt to identify a common
structural motif in the V3 loop regardless of the amino
acid sequence variability. We have previously determined
the structural features of two V3 loops: V3 Thailand
and V3 MN. In this paper, we present the structure
of two other variants: V3 Haiti and V3 RF. Our
results show that similar secondary structures are observed
in all the four V3 loops: a GPG(R/K/Q) crest in
the center of the neutralizing domain, two extended
regions flanking the central crest, and a helical region
in the C-terminal domain. For the Haitian V3 loop, we
also show how the conserved structural features are
masked through a conformational switch encoded in
the amino acid sequences on the C-terminal side of the
GPGK crest.


A neutralizing determinant (1-8) located inside the V3 loop
of the envelope glycoprotein, gp120, has been the target for
protective immunity against the human immunodeficiency virus,
type 1 (HIV-1). However, the amino acid sequence variation
within the V3 loop has limited the effectiveness of V3-based
vaccine design (4, 5). Antibodies against the V3 loop generally
exhibit type-specific neutralization profiles (6, 7), although a
subset of anti-V3 antibodies specific for the less variable elements
of the V3 loop show a broader range of neutralizing
activity (7, 8). To better understand the effect of sequence
variation on the structure and antigenicity of the HIV-V3 loop,
we developed a method combining molecular modeling (9, 28)
and two-dimensional NMR (10, 11) to analyze the global structure
of the entire cyclic V3 loop and the local structure at the
neutralizing determinant. We attempted to answer two specific
questions: (i) Are there conserved secondary structural elements
inside the V3 loop in spite of sequence variation? and (ii)
Can the sequence variation inside the V3 loop mask this conserved
secondary structure? Recently, we have shown (12) that
in spite of the observed sequence variation, a conserved secondary
structure is located inside the V3 loop. This structure
consists of a solvent-accessible protruding motif (or a knob)
spanning 8-10 residues with a central GPG(Q/R/K) type II
turn at the crest of the knob (10-12). In this article, we demonstrate
how amino acid sequence variation flanking the GPG
crest can camouflage an otherwise conformationally pure
epitope. For this purpose, we performed two-dimensional NMR
and molecular modeling studies on two cyclic V3 loops from the
Haitian and RF isolates (13):

            1                                 36
V3-Haiti    CCTRPNDNTRKSIPMGPGKAFYATGDIIGNIRQAHC    net charge +3
V3-RF       CCTRPNNNTRKSITKGPGRVIYATGQIIGDIRKAHC    net charge +5

(Cysteines at positions 2 and 36 are S-S bridged; the first
cysteine that is underlined in the sequence has a protective
group on S; site-specific differences in sequence are marked in
bold).
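
Because bold type does not survive in plain text, the site-specific differences between the two loops can be recovered by a position-by-position comparison. The following is a minimal Python sketch; the sequences are copied from the alignment above as it appears in this text, and the function name `site_differences` is a hypothetical helper introduced for illustration.

```python
# Sequences as printed in the alignment above (positions 1-36).
V3_HAITI = "CCTRPNDNTRKSIPMGPGKAFYATGDIIGNIRQAHC"
V3_RF = "CCTRPNNNTRKSITKGPGRVIYATGQIIGDIRKAHC"


def site_differences(seq_a: str, seq_b: str):
    """Return (position, residue_a, residue_b) for every mismatch, 1-based (hypothetical helper)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    for pos, haiti, rf in site_differences(V3_HAITI, V3_RF):
        print(f"position {pos}: Haiti {haiti} / RF {rf}")
```

With the sequences as given here, the comparison reports nine differing positions, several of which lie on the C-terminal side of the GPG(K/R) crest at positions 16-19.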

MATERIALS  AND  METHODS 

Synthesis and Purification. The cyclic Haitian V3 loop was synthesized
and purified by Dr. Anita Hong (Anaspec, CA). The cyclic RF V3
loop was synthesized and purified by Dr. G. M. Anantharamaiah,
University of Alabama at Birmingham. Both Drs. Hong and Anantharamaiah
provided their services under a contract with the AIDS Division
of the National Institutes of Health.

NMR Spectroscopy. All NMR experiments on the RF V3 loop were
carried out on the 600 MHz Bruker spectrometer at the University of
Alabama at Birmingham, whereas the data on the Haitian V3 loop were
collected on a 500 MHz Bruker AMX spectrometer at Chemistry, Science,
and Technology-4, Los Alamos National Laboratory. NMR spectra
were collected at 10 °C with 3 mM peptide concentration (pH 5.5) for the
Haitian V3 loop and at 5 mM peptide concentration (pH 5.5) for the RF
V3 loop. All two-dimensional data were acquired in the phase-sensitive
mode with the presaturation of the HDO signal during the relaxation
delay. DQF-COSY data (14) were collected with the following acquisition
parameters: t2 = 4096, t1 = 1024, relaxation delay = 1.5 s, number
of transients = 32 for the Haitian V3 loop and number of transients =
48 for the RF V3 loop. Total correlated spectroscopy data (14) were
collected with the following parameters: t2 = 2048, t1 = 1024, relaxation
delay = 1.5 s, number of transients = 32/48, isotropic mixing = 70
ms. NOESY (14) data were collected with similar acquisition parameters
and with 75 and 250 ms of mixing time (τm) for the Haitian V3 loop
and 200 and 400 ms for the RF V3 loop. The sequential assignments*
were performed by combining the total correlated spectroscopy,
DQF-COSY, and NOESY data, processed on an SGI Indigo instrument using
the Felix software (BioSym Inc.).

Structure Derivation. Two-dimensional NMR data of the Haitian V3
loop in water and in TFE/water mixture and for the RF V3 loop in water
were analyzed with the aid of full-matrix NOESY simulations, the associated
R-factor test, and energy calculations (10, 11). This produced a set
of distance constraints (9-12, 28). The energy term (EDIST defining the


* Authors on request will provide additional data containing proton
NMR assignments, J(HN-Hα) coupling constants, NOE volumes,
NOE-derived distances, Cartesian coordinates, and torsional angle data of the
three structures: V3 Haiti in water and in the mixed water/TFE solvent
and V3 RF in water. The surface accessibility data of the three structures
in Figs. 6-8 are also available from the authors on request.
Fig. 1. NOESY (mixing time = 250 ms) and DQF-COSY cross-sections
of the cyclic Haitian V3 loop in water (peptide concentration
= 3.0 mM, pH 5.5, temperature = 10 °C). A, the fingerprint
HN-Hα region. B, DQF-COSY HN-Hα cross-section. C, the HN-HN
region. For NOESY experiments, the acquisition parameters were as


distance constraints) is added to the force field as in the work of
Scheraga and co-workers (15). Monte Carlo simulated annealing (10,
11, 16) is performed to obtain a set of structures in agreement with
NMR data. The maximum step size of the torsion angles was set at 15°,
which produced an acceptance ratio of 0.20-0.30 for the 50,000-step MC
cycle at each temperature. Full-matrix NOESY calculations were repeated
for the final 50 low energy structures of the V3 loop. We define
distances from the NOE data in terms of the upper and lower limits
(range is above 0.5 Å). However, in our sampled structures the violations
of these distances are ~0.22 Å. This implies that the uncertainty
in distance estimate is always above 0.7 Å.
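
The sampling scheme described above can be illustrated with a minimal sketch. It is not the authors' program: the force field is omitted, the molecule is replaced by a hypothetical planar bead chain (`chain_coords`), and the only energy term is a flat-bottomed distance-restraint penalty standing in for EDIST. The function names, the 3.8 Å bond length, the restraint weight, and the temperature schedule are all assumptions made for illustration; only the Metropolis acceptance rule, the bounded random torsion step, and the per-temperature acceptance ratio follow the text.

```python
import math
import random


def chain_coords(angles_deg, bond=3.8):
    """Toy planar chain: each angle bends the heading of the next bond."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for angle in angles_deg:
        heading += math.radians(angle)
        x += bond * math.cos(heading)
        y += bond * math.sin(heading)
        points.append((x, y))
    return points


def restraint_energy(angles_deg, restraints, weight=1.0):
    """Flat-bottomed penalty: zero inside [lower, upper], quadratic outside."""
    points = chain_coords(angles_deg)
    energy = 0.0
    for i, j, lower, upper in restraints:
        d = math.dist(points[i], points[j])
        if d < lower:
            energy += weight * (lower - d) ** 2
        elif d > upper:
            energy += weight * (d - upper) ** 2
    return energy


def anneal(angles_deg, restraints, temperatures, steps, max_step=15.0, seed=0):
    """Metropolis Monte Carlo at each temperature of a cooling schedule.

    Returns the best angles found, their energy, and the acceptance
    ratio observed at each temperature.
    """
    rng = random.Random(seed)
    current = list(angles_deg)
    e_current = restraint_energy(current, restraints)
    best, e_best = list(current), e_current
    ratios = []
    for temp in temperatures:
        accepted = 0
        for _ in range(steps):
            trial = list(current)
            k = rng.randrange(len(trial))
            # Perturb one angle by at most max_step degrees in either direction.
            trial[k] += rng.uniform(-max_step, max_step)
            e_trial = restraint_energy(trial, restraints)
            delta = e_trial - e_current
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, e_current = trial, e_trial
                accepted += 1
                if e_current < e_best:
                    best, e_best = list(current), e_current
        ratios.append(accepted / steps)
    return best, e_best, ratios
```

In a sketch like this, the step size is the knob that controls the acceptance ratio: larger steps are rejected more often, which is presumably why a bounded torsion step was tuned to a target acceptance range in the work described here.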

RESULTS 

NMR Experiment on the Haitian V3 Loop— Fig. 1A shows the
NOESY fingerprint HN-Hα region of the Haitian V3 loop in
water at 10 °C for 250 ms of mixing. Fig. 1B shows the DQF-COSY
fingerprint HN-Hα region of the Haitian V3 loop in
water at 10 °C; the ranges of φ values are estimated from
J(HN-Hα) coupling data. Note the presence of intra-residue
HN-Hα and inter-residue HN(i+1)-Hα(i) cross-peaks. Proline
residues  (Pro^,  Pro^^,  and  Pro^'^)  show  H®(Pi+l)-H“(i)  cross¬ 
peaks.  Note  that  residues  CysS  Cys^,  Thr^,  and  Arg^  do  not 
show any cross-peaks. Fig. 1C shows the NOESY HN-HN region
of the Haitian V3 loop in water at 10 °C for 300 ms of
mixing. Although a set of sequential HN-HN cross-peaks is
observed in the C-terminal stretch (residues 28-33) of the V3
loop, no medium range NOEs such as Hα(i)-Hβ(i+3) or HN(i)-Hα(i+3)
cross-peaks are seen as corroborating evidence of a
helical stretch. Fig. 3 summarizes the NOE data of the Haitian
V3 loop in aqueous environment from the analyses of the data
at 100 and 300 ms of mixing. Note that only a few Hβ(i)-HN(i+1)
cross-peaks are observed for the Haitian V3 loop in
the  aqueous  environment.  Seven  nonsequential  NOESY  cross¬ 
peaks  are  observed  (see  panel  of  Fig.  3H). 
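
The step from a measured J(HN-Hα) coupling to a range of φ values can be sketched with a Karplus relation. The text does not say which parameterization was used; the coefficients below (6.4, -1.4, 1.9 Hz with a 60° phase) are a commonly quoted set for peptides and are an assumption here, as are the function names and the ±0.5 Hz tolerance. Because the curve is multivalued, a single coupling maps to several candidate φ ranges, which is why the text speaks of ranges rather than single values.

```python
import math


def karplus_j(phi_deg, a=6.4, b=-1.4, c=1.9):
    """Predicted 3J(HN-Halpha) in Hz for a backbone angle phi (degrees).

    The coefficients are an assumed parameterization, not taken from the text.
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def phi_candidates(j_obs, tol=0.5):
    """Integer phi values (degrees) whose predicted J lies within tol of j_obs."""
    return [phi for phi in range(-180, 181)
            if abs(karplus_j(phi) - j_obs) <= tol]
```

With these assumed coefficients, an extended residue (φ near -120°) gives a large coupling and a helical residue (φ near -60°) a small one, so `phi_candidates` returns different windows for, say, 9.5 Hz and 4.0 Hz.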

Fig. 2A shows the NOESY fingerprint HN-Hα region of the
Haitian V3 loop in a water/TFE (7:3) mixture at 10 °C for 250
ms of mixing. Although the cross-peaks of the Haitian V3 loop
are broader in the mixed solvent than in the aqueous environment,
we are able to assign residues 5-36. Fig. 2B shows the
NOESY HN-HN region of the Haitian V3 loop in the mixed
solvent at 10 °C for 250 ms of mixing. Due to the broadness of
the peaks it is not possible to decipher the interaction of two
HN protons that are close to the diagonal. However, quite a
number of distinct HN-HN cross-peaks are observed in this
cross-section. Fig. 3 summarizes the NOE data of the Haitian
V3 loop in the mixed solvent from the analyses of the data at 75
and 250 ms of mixing. The NOE data of the Haitian V3 loop in
the mixed solvent are distinct from those in water in the following
respects. (i) Medium range Hα(i)-Hβ(i+3) cross-peaks are observed
for the residue pairs (27, 30), (31, 28), (32, 29), and (33,
30), which are indicative of a helical core spanning residues
27-33. (ii) Although there is a decrease in the absolute intensities
of sequential HN-Hα and HN-HN cross-peaks, there is an
enhancement in the relative HN-HN/HN-Hα cross-peak intensities
for residues 26-34, which is again indicative of a helical
structure in this segment. (iii) A few sequential Hβ-HN cross-peaks
are observed in this stretch that are either weak or
absent in the aqueous solvent. (iv) Finally, the Hα protons of


follows: t2 = 2048 data points, t1 = 1024 data points, relaxation delay
= 1.5 s, number of transients = 32. The same acquisition parameters
were used for the DQF-COSY experiment except for t2, which was
increased to 4096 data points. Sequence specific assignments (14) were
obtained starting from Phe²¹ (only Phe in the sequence) and moving
backward and forward along the connectivity route until completion of
the assignments. Note the resonance doubling of the residue Gly¹⁸,
indicating a conformational equilibrium between the two forms. However,
no additional NOEs to discriminate between the two conformations
were observed.
Fig. 2. NOESY cross-sections of the cyclic Haitian V3 loop in
water/TFE (7:3) mixture (peptide concentration = 3.0 mM, pH
5.5, temperature = 10 °C). The HDO signal was presaturated for 1.2
s during the relaxation delay. Acquisition parameters: data matrix (t2
= 2048 data points, t1 = 1024 data points), relaxation delay = 1.2 s,
number of transients = 32, temperature = 10 °C. A, the fingerprint
HN-Hα region. B, the HN-HN region. In the fingerprint region note the
double population of the residue Gly¹⁸. One of the populations has an
additional medium range NOE between Hα-G18 and HN-F21, indicating
a conformational equilibrium between two forms, one extended and
one bent at the fragment Gly¹⁸-Lys¹⁹-Ala²⁰-Phe²¹. Interestingly, such a
double population for Gly¹⁸ is also observed in water, although the
medium range NOE with F21 is absent in the aqueous environment (see
Fig. 1). Note the uniform upfield shift of the resonances in the
C-terminal region of the peptide, which is a strong indicator of the helical
conformation.

the  residues  27-34  in  the  C  terminus  of  the  Haitian  V3  loop  are 
high  field  shifted  in  the  mixed  solvent  as  also  observed  in  the 
case  of  the  MN  V3  loop  (11). 

The residues in the neutralizing epitope,
Ile¹³-Pro¹⁴-Met¹⁵-Gly¹⁶-Pro¹⁷-Gly¹⁸-Lys¹⁹-Ala²⁰-Phe²¹-Tyr²², are unequivocally
assigned in water and in the mixed solvent. Two conformational
states of Gly¹⁸ (i.e. Gly¹⁸ and Gly¹⁸*) are clearly evident. Gly¹⁸
not only shows two populations in terms of the chemical shift
values of (Hα, HN) but also in terms of the NOESY connectivities.
As shown in Fig. 2A, a second population of Gly¹⁸* is
observed in mixed populations; Gly¹⁸ shows a Hα(G18)-HN(F21)
cross-peak (Fig. 3). The chemical shifts of other residues
in the second population are indistinguishable from the
first. Although this conformational variant is also present in
the aqueous solvent, the absence of the Hα(G18)-HN(F21)
cross-peak suggests that such an interaction perhaps is not
stabilized in a polar environment.

NMR  Experiment  on  the  RF  V3  Loop — We  have  chosen  the 
RF  V3  loop  to  examine  the  effect  of  sequence  variation  on  the 
overall  folding  of  the  V3  loop  and  the  local  structure  at  the  GPG 
crest.  The  RF  V3  loop  is  different  from  the  Haitian  V3  loop  at 
eight positions and is more positively charged. Fig. 4A shows
the NOESY fingerprint HN-Hα region of the RF V3 loop in
water at 10 °C for 400 ms of mixing. Note the presence of
intra-residue  HN-H"  and  inter-residue  HN(i+l)-H"(i)  cross¬ 
peaks  except  for  the  residues  Cys^,  Cys^,  Thr^,  Arg^,  Pro^,  and 
Asn®,  possibly  due  to  inherent  flexibility  in  the  N-terminal 
region. Fig. 4B shows the NOESY HN-HN region of the RF V3
loop in water at 10 °C for 300 ms of mixing. Although a set of
sequential HN-HN cross-peaks is observed in the C-terminal
stretch (residues 29-34) of the V3 loop, no medium range
NOEs indicative of helical fragments are seen. Fig. 5 summarizes
the NOE data of the RF V3 loop in the aqueous environment
from the analyses of the data at 200 and 400 ms of
mixing. Only intra-residue and sequential NOEs are observed.
The J(HN-Hα) coupling constants from the DQF-COSY spectrum
(Fig. 4C) provide the φ values, which are converted into HN-Hα
intra-residue distances.^ The lack of solubility of the highly
polar RF V3 loop prevented us from carrying out NMR experiments
in the mixed water/TFE (7:3) solvent.

Structures  of  the  Haitian  V3  Loop — We  have  previously  re¬ 
ported  structural  studies  on  the  Thailand  and  MN  V3  loops  (10, 
11).  The  NMR  data  for  those  two  sequences  revealed  that  the 
structure  of  the  V3  loop  contained  a  few  reasonably  well  de¬ 
fined  secondary  structural  elements,  i.e.  a  GPGR(Q)  turn  and  a 
nascent  C-terminal  helix.  However,  the  V3  loops  are  consider¬ 
ably  flexible  within  the  constraints  of  these  secondary  struc¬ 
tural  elements  and  the  Cys^-Cys^®  disulfide  bridge.  Therefore, 
we developed and applied a method to explore the extent of
conformational flexibility of the V3 loop that is consistent with
the NMR data (10, 11). Briefly, we employ the following steps.
(i) We analyze the NMR data to assign secondary structural
states, i.e. ranges of φ and ψ values, to the residues of the V3
loop. The NMR data for the aqueous form include the NOEs in
Fig. 3 and the DQF-COSY data (Fig. 1C); the J(HN-Hα) coupling
constants from the DQF-COSY spectrum provide the φ values,
which are converted into HN-Hα intra-residue distances.^ Line
broadening in the mixed solvent prevented us from obtaining a
high quality DQF-COSY spectrum in the water/TFE (7:3) mixture.
(ii) We obtain a set of starting (energy minimized) structures
of the V3 loop subject to the φ and ψ values and disulfide
bridge constraints. (iii) We then use these starting structures
for Monte Carlo simulated annealing and energy minimizations
for sampling the conformational space. (iv) We finally
select a set of low energy structures (50 for the Haitian V3 loop)
and analyze the conformational parameters to examine the
nature of the flexibility.

Fig. 6 shows the ribbon models of the Haitian V3 loop in
water (left) and water/TFE (7:3) mixture (right). The following
color coding was used in these ribbon diagrams: gray for the
N-terminal protruding loop at position T3-R10, green for the
N-terminal extended β-strand flanking the GPG crest, magenta
for the central β-turn at position G16-P17-G18-K19, yellow
for the C-terminal extended β-strand flanking the GPG
crest, and blue for the C-terminal segment D26-H35, which can
form an α-helix. In water the C-terminal segment consists of
Fig. 3. Summary of the NMR data for the HIV Haitian V3 loop in water (under panel 1) and in water/TFE (7:3) mixed solvent
(under panel 2). B, medium range NOEs. In addition to sequential Hα-HN and HN-HN NOEs, six sequential Hβ-HN NOEs were obtained in both the solvents.
The sequential Hα-HN connectivity for Pro is missing, but the sequential NOEs provide the link. Note the change in the
sequential NOE pattern in the C-terminal segment; in the mixed solvent, there is an enhancement of the sequential HN-HN NOEs relative to the
corresponding Hα-HN NOEs, indicative of an induction of a helix.

^ In a previous work (24, 25), the NMR assignments of the V3 loops of MN and
IIIB isolates in aqueous solution were obtained. The authors also performed CD studies on TFE mixed solutions but did not succeed in assigning
the NMR spectra in such mixed solvents. In another NMR work (26), the authors studied two 24-residue linear peptides containing the neutralizing
determinant of the IIIB isolate of HIV-1. They reported (26) the presence of a transient turn at the GPGR crest, and the ability of mixed TFE
solutions to induce helical conformation in the C-terminal domain of the peptides, whereas in water only a “nascent helix” formed by a stretch of
interconnected turns was observed.


consecutive  j3-turns  centered  around  Ile^®-Gly^®  and  Ile^^- 
Arg^^,  whereas  in  the  water/TFE  (7:3)  mixture  it  further  folds 
into a well defined α-helix as evidenced by the presence of
sequential HN-HN NOEs (Fig. 2B) and medium range Hβ(i+3)-Hα(i)
NOEs (Fig. 3B, panel 2). Due to the intrinsic conformational
flexibility of the V3 loop, side chains are quite mobile,
and they do not sample a single rotamer conformation. In these
representations only average positions for the side chains are
shown. However, in the α-helical region for the mixed solvent
structure, the side chains are organized in a cylindrical array
as experimentally observed by the presence of a network of
dαβ(i,i+3) and dNN sequential connectivities (Fig. 3B). Nonetheless,
in both the structures (Fig. 6) the neutralizing epitope
containing the central GPGK sequence forms a protruding loop
even though the local structure and presentation of the loop in
the two cases are noticeably different. The aqueous structure of
the Haitian V3 loop in Fig. 6 is the average of 50 sampled
configurations that exhibit rms deviations below 1.5 Å with
respect to the backbone atoms. Out of 50 sampled structures of
the Haitian V3 loop in the TFE/water mixture, a small subset
of six structures shows a large (>2.6 Å) rms deviation of the
backbone atoms from the rest of the structures. The remaining
44 structures are within 1.6 Å rms deviations of the backbone
atoms. The average structure in Fig. 6 is taken over these 44
structures.
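
The idea of setting aside a few structures that deviate strongly from the rest before averaging can be sketched as follows. The text does not describe how the backbone rms deviations were computed, so this sketch swaps in a distance-matrix RMSD (dRMSD), which needs no superposition, and uses the median deviation to all other structures as the outlier criterion. Both choices, the function names, and the cutoff are assumptions for illustration only.

```python
import math
from itertools import combinations
from statistics import median


def drmsd(a, b):
    """Distance-matrix RMSD between two conformations.

    a and b are equal-length lists of (x, y, z) atom positions.
    """
    pairs = list(combinations(range(len(a)), 2))
    total = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
                for i, j in pairs)
    return math.sqrt(total / len(pairs))


def split_ensemble(ensemble, cutoff):
    """Return (core, outliers) as lists of indices into ensemble.

    A structure is an outlier when its median dRMSD to all the other
    structures exceeds cutoff.
    """
    core, outliers = [], []
    for m, conf in enumerate(ensemble):
        others = [drmsd(conf, other)
                  for n, other in enumerate(ensemble) if n != m]
        (outliers if median(others) > cutoff else core).append(m)
    return core, outliers
```

An average structure would then be taken over the `core` indices only, mirroring the averaging over the 44 retained structures described above.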

Fig. 7 shows two conformations representing the aqueous
environment (left) and the mixed solvent forms (right). The
central region of the Haitian V3 loop containing the neutralizing
determinant residues
Ile¹³-Pro¹⁴-Met¹⁵-Gly¹⁶-Pro¹⁷-Gly¹⁸-Lys¹⁹-Ala²⁰-Phe²¹-Tyr²² is shown. The following color coding
was used in these skeleton models: red for the central
Gly¹⁶-Pro¹⁷-Gly¹⁸-Lys¹⁹ crest, green for the N-terminal Ile¹³-Pro¹⁴-Met¹⁵
residues in extended conformation, and yellow for the
C-terminal Ala²⁰-Phe²¹-Tyr²² residues, which show a solvent-induced
effect. In the water structure the C-terminal fragment
is in an extended conformation (open state). In the mixed
solvent, two types of chain folding are observed: one folded form
is similar to that of the MN-V3 loop (11), whereas in the other,
the GPG crest forms the typical type II β-turn followed by a
type III β-turn involving residues Gly¹⁸-Arg¹⁹-Ala²⁰-Phe²¹, as
evidenced by the presence of a medium range NOE between
Hα-G18 and HN-F21 (closed state). Such an S-shaped conformation
has been previously reported for a peptide containing
the V3 neutralizing determinant complexed to an antibody
(18), and it will be referred to as the “arched” conformation for the
rest of the paper. Our NMR data (Figs. 1A and 3) clearly
indicated that these two states are simultaneously present in
mixed solvent, whereas only the open state exists in aqueous
solutions. We have not shown the open state structure of the
Haitian V3 loop in the mixed solvent because it closely resembles
the already published structure of the MN-V3 loop (11).

Structure  of  the  RF  V3  Loop — Fig.  8  shows  the  ribbon  dia¬ 
gram  for  the  structure  of  the  RF  V3  loop  in  water.  Here,  again 
the  conformational  analysis  is  done  for  50  sampled  low  energy 
structures.  All  the  sampled  structures  showed  rms  deviations 
of 0.22 ± 0.02 Å with respect to 79 independent distance
constraints.  The  same  color  coding  described  in  Fig.  6  was  used 
here.  The  average  structural  features  of  the  V3-RF  in  water 
resemble  those  observed  for  the  Haitian  and  the  MN  V3  loops 
(11);  however,  the  absence  of  any  nonsequential  NOE  suggests 
Fig. 4. NOESY (mixing time = 400 ms) and DQF-COSY cross-sections
of the cyclic RF V3 loop in water (peptide concentration
= 5.0 mM, pH 5.5). A, the fingerprint HN-Hα region. B, the
HN-HN region. C, DQF-COSY cross-sections of the cyclic RF V3 loop.
For NOESY experiments, the acquisition parameters were as follows: t2


that  the  RF  V3  loop  structure  is  considerably  more  flexible 
than  the  Haiti-V3  and  MN-V3  loops.  Out  of  50  sampled  struc¬ 
tures,  a  small  subset  of  eight  structures  shows  a  large  {>2/1  A) 
rms  deviations  of  the  backbone  atoms  from  the  rest  of  the 
structures. The remaining 42 structures are within 1.7 Å rms
deviations  of  the  backbone  atoms.  The  average  structure  in  Fig. 
8  is  taken  over  these  42  structures. 

DISCUSSION 

Previous  NMR  studies  on  the  Thailand  and  MN  V3  loops  (10, 
11)  and  the  current  work  on  the  Haiti  and  RF  V3  loops  (Figs. 
6-8) can be summarized as follows. (i) A GPG-turn at the crest
of the V3 loop is present in all the four sequences. (ii) Stretches
of β-strand adjacent to the GPG-turn on the N- and C-terminal
sides are common to all the four sequences. (iii) The residues in
the C-terminal segment form a few turns in water and a helix
in the less polar mixed solvent. (iv) In spite of the constraints of
secondary  structures  ((i)-(iii))  and  the  disulfide  bridge,  the  V3 
loop  exhibits  conformational  flexibility  as  evidenced  by  the 
absence  of  long  range  NOESY  interactions  commonly  observed 
in  well  folded  globular  proteins  (14). 

However, a “protruding knob” formed by the central GPG
turn and the β-strands on either side emerges as the secondary
structural feature conserved among diverse V3 loop sequences.
The single crystal structure of the HIV-1 neutralizing antibody
(monoclonal antibody 50.1) complexed to a 16-residue-long linear
MN-V3 fragment shows the hint of such a protruding knob,
although the segment on the C-terminal side of the GPGR type
II turn remains disordered (18). The crystallographic observation
suggests that the protruding knob of the V3 loop that
includes the neutralizing epitope might well be specifically
recognized by the antibody. However, the conserved protruding
knob of the V3 loop need not always be presented in its
conformationally pure form because HIV may find a way to mask this
conserved secondary structural element. In this work we report
one such mechanism of masking as revealed by the closed state
in Fig. 7. In this form of the Haitian V3 loop, the NMR data
indicate an arching of the residues on the C-terminal side of the
GPGK turn. This is a departure from the protruding knob motif
that contains the central GPG turn and two β-strands on either
side. Such an arched conformation of the neutralizing epitope
has also been observed in an antibody (monoclonal antibody
59.1) complexed with a linear V3 fragment (18).

When combined with the single crystal data (17, 18), our
NMR data (Refs. 10 and 11 and this work) indicate that the
closed or arched conformation of the neutralizing epitope of the
V3 loop is possible and can be recognized by the antibody. In
addition, our data also indicate that an equilibrium between
the closed and open state (Fig. 7) is possible for the same V3
loop sequence. The arching around Ala²⁰-Phe²¹ tends to mask
Lys¹⁹ and Ala²⁰ (Fig. 7). The closed form of the V3 loop may
camouflage some essential elements of the neutralizing epitope
from the immune system. For instance, this masking will interfere
with the binding of antibodies (8, 19) that recognize the
PGRAF epitope.

Most importantly, such a local masking of Ala²⁰ and Phe²¹
should affect the proteolysis of the (Arg/Gln/Lys)¹⁹-Ala²⁰ peptide
bond by thrombin and tryptase (20, 21); the second enzyme


= 2048 data points, t1 = 1024 data points, relaxation delay = 1.5 s,
number of transients = 32. The same acquisition parameters were used for
the DQF-COSY experiment except for t2, which was increased to 4 K.
Sequence-specific  assignments  were  obtained  starting  from  VaP*^  (only 
Val  in  the  sequence)  and  moving  backward  and  forward  along  the 
connectivity  route  until  completion  of  the  assignments.  Assignments  in 
the  fragment  Cys^-Asn'^  were  not  possible,  presumably  due  to  the  in¬ 
trinsic  flexibility  in  the  region  Asn®-Asn'^-Asn®. 
Fig. 5. Summary of the NMR data for the HIV RF V3 loop in water. Sequential and HN-HN NOEs. The sequential Hα-HN
connectivity for Pro is missing, but the sequential NOEs provide the NOE link.

^ Lorimier et al. (26) studied a 40-residue-long peptide
containing a T-helper epitope 16 residues long in the N-terminal region and a 24-residue-long segment derived from the V3 loop of the HIV-RF
isolate studied in this work. In the V3 loop region the authors observed the GPGR turn, whereas the rest of the C-terminal fragment was
disordered. However, we would like to stress that in the study of the linear peptides (24, 25), the authors did not consider the importance of the
disulfide bridge locking the V3 fragment in a closed loop. We have previously shown that the cyclic V3 loops are better ligands for V3-specific
antibodies (11).


Fig. 6. The ribbon diagrams describe representative folding patterns for the structures of the Haitian V3 loop in water (left) and
in mixed water/TFE solvent (right). The following color coding was used in these ribbon diagrams: gray for the N-terminal protruding loop,
green for the N-terminal extended β-strand flanking the GPG crest, magenta for the central β-turn at the GPG crest, yellow for the C-terminal
extended β-strand flanking the GPG crest, and blue for the C-terminal segment, which can form an α-helix. In each case, the average is done over
50 sampled low energy structures. Ribbon models in the two cases correspond to the average structure. All the sampled structures of the Haitian
V3 loop in water showed rms deviations of 0.24 ± 0.02 Å with respect to 95 independent distance constraints. All the sampled structures of the
Haitian V3 loop in mixed water/TFE solvent showed rms deviations of 0.27 ± 0.02 Å with respect to 123 independent distance constraints. The
structures of the Haitian V3 loop in water show a greater degree of flexibility than those in the mixed water/TFE solvent; this is due to the
formation of the C-terminal helix in the mixed solvent.


Fig. 7. Two conformations representing the aqueous environment
(left, labeled Open) and mixed solvent forms (right, labeled Closed).
The central region of the Haitian V3 loop containing the neutralizing
determinant residues Ile¹³-Pro¹⁴-Met¹⁵-Gly¹⁶-Pro¹⁷-Gly¹⁸-Lys¹⁹-Ala²⁰-Phe²¹-Tyr²²
is shown. The following color coding was used in these skeleton models:
green for the N-terminal Ile¹³-Pro¹⁴-Met¹⁵ residues in extended
conformation, red for the central Gly¹⁶-Pro¹⁷-Gly¹⁸-Lys¹⁹ crest, and yellow
for the C-terminal Ala²⁰-Phe²¹-Tyr²² residues, which show a solvent-induced
arching effect. Solvent-accessible areas were calculated using the
Molecular Surface Package due to Connolly (27) with a probe radius of
1.5 Å. In the fragment Lys¹⁹-Phe²¹, the aqueous structure has a lower
surface accessibility than the structure in the mixed TFE/water solvent.
Fig. 8. The ribbon diagrams describe representative folding
pattern  of  the  RF  V3  loop  in  water.  The  same  color  coding  described 
in  the  legend  to  Fig.  6  was  used  here.  The  structural  features  of  the 
V3-RF  in  water  resemble  those  observed  for  the  Haitian  and  the  MN  V3 
loops  in  water  (11). 

lies on the T-cell surface. When gp120 is used as a substrate
these two enzymes show exceptional specificity for cleavage of
the (Arg/Gln/Lys)¹⁹-Ala²⁰ peptide bond inside the V3 loop. Most
striking is the observation that the V3 loops of T-cell tropic
virus strains are 1,000 times more susceptible to cleavage by
these two enzymes than the V3 loops of macrophage tropic
strains (21). The T-cell tropic V3 loops are more positively
charged than the macrophage tropic V3 loops (21-23). Our
studies reveal that the open state of the neutralizing epitope of
the V3 loop is exclusively preferred for MN and RF V3 loops
with net charges of more than +5, whereas the closed state of
the neutralizing epitope begins to appear for the Haitian V3
loop with a net charge of +3. Therefore, the proteolysis data
(21) are consistent with our structural conclusions.
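
The net-charge comparison above can be illustrated with a simple count. The convention used here is an assumption, since the text does not state how the net charges were obtained: Lys and Arg count +1, Asp and Glu count -1, and His and the chain termini are ignored. The sequence is the cyclic HIV-MN V3 loop as quoted elsewhere in this document; the function name is hypothetical.

```python
POSITIVE = set("KR")  # Lys, Arg counted as +1 (assumed convention)
NEGATIVE = set("DE")  # Asp, Glu counted as -1 (assumed convention)


def net_charge(sequence):
    """Formal net charge of a peptide from its one-letter sequence."""
    positives = sum(1 for aa in sequence if aa in POSITIVE)
    negatives = sum(1 for aa in sequence if aa in NEGATIVE)
    return positives - negatives


# Cyclic HIV-MN V3 loop sequence as quoted in this document.
MN_V3 = "CTRPNYNKRKRIHIGPGRAFYTTKNIIGTIRQAHC"
```

Under this convention the MN loop carries five Arg and three Lys and no acidic residues, so `net_charge(MN_V3)` is +8, in line with the statement that the MN loop has a net charge of more than +5.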
Studies of the feasibility of a subunit vaccine to protect
against human immunodeficiency virus (HIV) infection
have principally focused on the third variable (V3)
loop. The principal neutralizing determinant (PND) of
HIV-1 is located inside the V3 loop of the surface envelope
glycoprotein, gp120. However, progress toward a
PND-based vaccine has been impeded by the amino acid
sequence variability in the V3 loops of different HIV
isolates. Theoretical studies revealed that the variability
in sequence and structure of the V3 loop is confined
to the N- and C-terminal sides of the conserved GPG
crest. This leaves three regions of the V3 loop conserved
both in sequence and secondary structure. We present
the results of NMR studies that test the validity of our
theoretical predictions. Structural studies are reported
for the HIV-V3 loop (HIV-MN) in the linear and cyclic
(S-S-bridged) forms. For the V3 loop sequence of the
HIV-MN isolate, the three conserved secondary structural
elements are as underlined below:

turns  turn  helix

CTRPNYNKRKRIHIGPGRAFYTTKNIIGTIRQAHC

Finally, the conformational requirement of the PND in
the V3 loop-antibody interaction is tested by monitoring
the monoclonal antibody binding to the HIV-MN V3 loop
in the linear and cyclic forms by enzyme-linked immunosorbent
assay. The binding data reveal that the cyclic
V3 loop is a better ligand for the monoclonal antibodies
than the linear form although the latter has the same
sequence. This means that the monoclonal antibodies
recognize the PNDs as conformational epitopes.


The surface of human immunodeficiency virus (HIV)^ is
studded with several copies of the glycoprotein gp120 (1). A
segment of gp120 is being considered as a potential antigenic
target for protective humoral immunity. This segment is located
inside the third variable loop, called the V3 loop (Fig. 1).
Monoclonal antibodies raised against the V3 loop can neutralize
the viral infection by specifically binding to the principal
neutralizing determinant (PND) located inside the V3 loop
(2-4). However, the HIV V3 loop undergoes sequence mutations
at a rapid rate in order to escape the immune surveillance
of the host cell. But certain segments of the V3 loop remain
fairly conserved among different HIV isolates; these conserved
segments of the V3 loop are probably essential in the life-cycle
of the virus. Therefore, the virus mutates the V3 loop only
enough to escape the immune pressure without risking its own
life inside the host cell. The elusive nature of the V3 loop calls
attention toward two important structural aspects in relation
to V3 loop-antibody interaction: (i) the global tertiary folding of
the V3 loop and (ii) the structure and presentation of the PND.
This article describes the results of a study aimed at exploring
these structural aspects by combining two-dimensional NMR
spectroscopy, molecular modeling, and antibody binding measurements
of the V3 loop from the HIV isolate from Minnesota
(HIV-MN).

EXPERIMENTAL  PROCEDURES 
Materials 

The linear and cyclic HIV-MN V3 loops were obtained from Peptide
and Protein Research Consultants, Washington Singer Laboratories,
UK, using an NIH reagent contract. The purity of the linear V3 loop
(TRPNYNKRKRIHIGPGRAFYTTKNIIGTIRQAH) was ~99% by high
pressure liquid chromatography; a major peak at 3878.8 was obtained
by fast atom bombardment mass spectroscopy, which agrees with the
average molecular weight (Mr) of 3878.5 as required by the desired
structure. The cyclic form (CTRPNYNKRKRIHIGPGRAFYTTKNIIGTIRQAHC)
showed purity of ~95%, i.e. a single major peak with
a few minor impurities. A major fast atom bombardment mass spectroscopy
peak at 4083.1 corresponded well with the Mr of 4082.7 as
required by the desired structure.

Structure  Determination:  NMR  and  Modeling 

The methodology for structure determination consisted of three
steps: steps 1 and 2 for theoretical analyses of the structure and flexibility
of the V3 loop and step 3 for experimental verification of theoretical
predictions by NMR spectroscopy.

Step 1: Prediction of Secondary Structures— The secondary
structural elements were predicted for a V3 loop sequence by computing the
probability S of a given residue i in the V3 loop to adopt a k-type of
conformation (k = helix, h; β sheet, b; coil, c; or turn, t), where:


* This work was supported by National Institutes of Health Grant
R01 AI32891-01A2. The NMR work was done at the NMR facility at the
University  of  California,  Davis,  using  the  GE  500  MHz  spectrometer 
(funded  by  National  Science  Foundation  Grant  DIR-88-04739  and 
United  States  Public  Health  Service  Grant  RR04795).  The  costs  of 
nuclear Overhauser effect; TFE, 2,2,2-trifluoroethanol; NOESY,
nuclear Overhauser and exchange spectroscopy.


\[
S(k, i) = \sum_{l=-\gamma}^{\gamma} \frac{P(k,\, i + l)}{|l| + 1}
\]


(The summation is over l = -γ to γ, where γ = size of the window
chosen to account for the effect of the neighboring amino acid residues:
γ = 5 for h; γ = 3 for b; and γ = 4 for c or t). P(k, i) = potential for the
k-type of conformation of individual residue i derived from the analysis
of the single crystal structures of about 65 proteins. The highest S(k, i)
determines the conformation k for residue i (5). Any existing
algorithm for secondary structure prediction is only 60% accurate. In
order to improve accuracy, we tested our predictions by requiring an
S-S bridge formation that achieves local energy minima for the cyclic
V3 loop; this led to step 2 in our method.
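
The windowed score of step 1 can be sketched in a few lines. This is a minimal illustration, not the authors' program: the per-residue potentials P(k, i) are taken as given input (the tables derived from the crystal structures are not reproduced in the text), and ignoring window positions that fall outside the sequence is an assumption. The names `WINDOW`, `score`, and `predict` are hypothetical.

```python
# Window half-widths (gamma) quoted in the text for each conformational state.
WINDOW = {"h": 5, "b": 3, "c": 4, "t": 4}


def score(potential, k, i):
    """S(k, i) = sum over l in [-gamma, gamma] of P(k, i + l) / (|l| + 1).

    potential[k][j] is the potential P(k, j) of residue j for state k.
    """
    gamma = WINDOW[k]
    values = potential[k]
    total = 0.0
    for l in range(-gamma, gamma + 1):
        j = i + l
        # ASSUMPTION: positions beyond the chain ends contribute nothing.
        if 0 <= j < len(values):
            total += values[j] / (abs(l) + 1)
    return total


def predict(potential):
    """Assign to every residue the state k with the highest S(k, i)."""
    length = len(next(iter(potential.values())))
    return [max(WINDOW, key=lambda k: score(potential, k, i))
            for i in range(length)]
```

For a three-residue toy input with P(h, j) = 1 everywhere and all other potentials zero, the central residue scores 1/2 + 1 + 1/2 = 2 for helix and every residue is assigned h.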

Step 2: Generation of Energy-minimized S-S-bridged V3 Loop— This
step involved obtaining an energetically stable S-S-bridged structure
Fig. 1. Analyses of the amino acid sequence and the conserved secondary structural elements of V3 loop sequences. A molecular
modeling method (9-11) is used to define secondary structural states of relatively conserved and highly variable regions of V3 loop sequences, and
to predict the resulting energetically stable tertiary fold(s) of the S-S-bridged V3 loop. We found (11) that, regardless of the neighboring amino
acids, the boxed regions retained the same secondary structural elements, i.e. (i) a loop with two consecutive β-turns at the N-terminal segment,
(ii) a type II β-turn at the GPG-crest, and (iii) a C-terminal helix. Both the North American consensus V3 loop and the HIV V3 loop isolated from
a Minnesota patient (HIV-MN) showed the same secondary structural elements for the three conserved regions of the V3 loop. A, the most common
amino acid found in each site is shown on the top row; this row corresponds to the North American consensus V3 loop sequence. The percent
frequency with which an amino acid occurs at each site is shown directly below, and beneath each percent frequency, a column of the amino acids
that can occupy each position is listed in descending order of their percent frequency at the given site. (This figure is adapted from Roterman et
al. (7).) The conserved regions are numbered 1 through 3. B, the amino acid sequence and the conserved secondary structural elements in the
HIV-MN V3 loop.


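for a V3 loop sequence given the secondary structural states of the
constituent amino acid residues as obtained after step 1 or from
analyses of two-dimensional NMR data (discussed later). Appropriate
ranges of (φ, ψ) values were assigned to all amino acids. For example,

\[
\begin{aligned}
\varphi &= -55^\circ \pm 25^\circ, & \psi &= -55^\circ \pm 25^\circ && \text{(for residues in helix)} \\
\varphi &= -140^\circ \pm 30^\circ, & \psi &= 140^\circ \pm 30^\circ && \text{(for residues in a } \beta \text{ strand)} \\
\varphi_{i+1} &= -65^\circ \pm 20^\circ, & \psi_{i+1} &= -50^\circ \pm 20^\circ \\
\varphi_{i+2} &= -90^\circ \pm 20^\circ, & \psi_{i+2} &= 0^\circ \pm 20^\circ && \text{(for residues in a type-I turn)} \\
\varphi_{i+1} &= -65^\circ \pm 20^\circ, & \psi_{i+1} &= 120^\circ \pm 20^\circ \\
\varphi_{i+2} &= 90^\circ \pm 20^\circ, & \psi_{i+2} &= 0^\circ \pm 20^\circ && \text{(for residues in a type-II turn)}
\end{aligned}
\]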


(φ, ψ) of residues in the coil state were set free to choose any point in the
allowed space (for definitions of different secondary structures and
corresponding (φ, ψ) values, see Ramachandran and Sasisekharan (6)).
We simplified the sequence by assuming Ala for residues with side
chains extending beyond the Cβ atom, except for the Pro and the
terminal Cys. Our rationale for doing this was that the allowed (φ, ψ)
space of residues with a side chain longer than Ala is only a subspace of
that allowed for Ala (6).
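
The (φ, ψ) ranges listed above lend themselves to a small lookup table. The sketch below is an illustration under stated assumptions: the pairing of each ψ range with its turn position follows the standard type-I and type-II turn definitions and is read from the list above, and the names `RANGES`, `angle_difference`, and `in_range` are hypothetical.

```python
# (centre, half-width) in degrees for each secondary structural state.
RANGES = {
    "helix":       {"phi": (-55.0, 25.0),  "psi": (-55.0, 25.0)},
    "strand":      {"phi": (-140.0, 30.0), "psi": (140.0, 30.0)},
    "turn_I_i+1":  {"phi": (-65.0, 20.0),  "psi": (-50.0, 20.0)},
    "turn_I_i+2":  {"phi": (-90.0, 20.0),  "psi": (0.0, 20.0)},
    "turn_II_i+1": {"phi": (-65.0, 20.0),  "psi": (120.0, 20.0)},
    "turn_II_i+2": {"phi": (90.0, 20.0),   "psi": (0.0, 20.0)},
}


def angle_difference(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def in_range(state, phi, psi):
    """True if (phi, psi) lies inside the range assigned to the state."""
    limits = RANGES[state]
    phi_ok = abs(angle_difference(phi, limits["phi"][0])) <= limits["phi"][1]
    psi_ok = abs(angle_difference(psi, limits["psi"][0])) <= limits["psi"][1]
    return phi_ok and psi_ok
```

Residues in the coil state have no entry, in keeping with the statement that their (φ, ψ) values are left free.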

We obtained an S-S-bridged structure of a V3 loop by using a
linked-atom least-squares refinement equation that minimizes function F in
the space (φ, ψ):

\[
F = \sum_i \lambda_i G_i + \sum_{ij} \left( d_{ij}^{mn} - D^{mn} \right)^2
\]

where G_i (= |r_i| - r_i^0 = 0) indicates distance constraints for
an S-S bridge. Distances in the S-S-bridged V3 loop configuration are
defined as r_1 = S(C1) - S(C35), r_2 = Cβ(C1) - S(C35), r_3 = Cβ(C35) -
S(C1), and r_4 = Cβ(C1) - Cβ(C35); corresponding equilibrium distances
are r_1^0 = 2.04 Å, r_2^0 = r_3^0 = 3.05 Å, r_4^0 = 3.85 Å (7). λ_i
indicates Lagrangian multipliers; d_ij^mn indicates distance between
atom i (type m) and atom j (type n); and D^mn indicates the contact
limit between atom (type m) and atom (type n) (6). In this refinement
(φ, ψ) of various residues were treated as elastic variables (i.e.
variables with weights) such that by appropriate choice of weights the
predicted secondary structural states of residues (after step 1) were
minimally altered (8, 9). This method guarantees a stereochemically
orthodox structure for the S-S-bridged (CA33C)-like sequence. Finally,
appropriate side chains were attached to generate an actual V3 loop
sequence and the potential energy of the system was minimized in the
(φ, ψ, ω, χ)-space using the force-field of Sippl et al. as cited in
Ref. 11. Several energy-minimized structures of a given V3 loop
sequence were obtained by choosing a number of starting structures
within the specified ranges of (φ, ψ) values predicted in step 1.
Conformational sampling of a V3 loop sequence belonging to a given
family of tertiary fold was performed by Monte Carlo simulated
annealing (9). If the secondary structural states of one or more
residues as predicted in step 1 were energetically unfavorable for the
cyclic V3 loop, those states were altered in step 2. Even though
wrongly predicted secondary structural states of a residue from step 1
were corrected in this step using the energy criteria of a cyclic V3
loop, it was necessary to examine which secondary structural states of
the residues in the V3 loop were predominantly present in solution.
This led to step 3 of our methodology.
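
The S-S bridge closure condition of step 2 can be illustrated by computing how far the four bridge distances deviate from their equilibrium values. This is a sketch only, not the linked-atom least-squares refinement itself: assigning 3.05 Å to both Cβ...S distances is an assumption based on the symmetry of the bridge, and the tolerance and the names `R0`, `bridge_residuals`, and `bridge_closed` are hypothetical.

```python
import math

# Equilibrium distances (Angstrom) for the Cys1-Cys35 bridge.
# ASSUMPTION: both C-beta...S distances share the 3.05 value.
R0 = {"S1-S35": 2.04, "CB1-S35": 3.05, "CB35-S1": 3.05, "CB1-CB35": 3.85}


def bridge_residuals(s1, cb1, s35, cb35):
    """Return |r_i| - r_i^0 for the four bridge distances (G_i in the text)."""
    actual = {
        "S1-S35": math.dist(s1, s35),
        "CB1-S35": math.dist(cb1, s35),
        "CB35-S1": math.dist(cb35, s1),
        "CB1-CB35": math.dist(cb1, cb35),
    }
    return {name: actual[name] - R0[name] for name in R0}


def bridge_closed(s1, cb1, s35, cb35, tolerance=0.05):
    """True if every bridge distance is within the tolerance of equilibrium."""
    residuals = bridge_residuals(s1, cb1, s35, cb35)
    return all(abs(value) <= tolerance for value in residuals.values())
```

A refinement would drive all four residuals to zero while changing the (φ, ψ) values of the chain as little as possible; the functions above only report how far a given set of coordinates is from that condition.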

Fig. 2. NOESY cross-sections of the cyclic MN V3 loop in water (peptide
concentration = 3.5 mM; pH 4.5). The pulse-sequence due to Sklenar-Bax (24)
was used for solvent suppression. Acquisition parameters: data matrix
(t2 = 2K, t1 = 1K), relaxation delay = 1.5 s, number of transients = 32,
temperature = 10 °C. A, the fingerprint HN-Hα region; B, the HN-HN region.
Sequence specific assignments (25) were obtained starting from F20, which
represents a unique residue in the sequence, and moving backward and
forward along the connectivity route until completion of the assignments.
[Fig. 2 graphic not reproduced; horizontal axis D1 (ppm).]

Step 3: Use of Two-dimensional NMR Experiments— This step
involved (i) sequential assignment of the protons belonging to constituent
amino acid residues, (ii) extraction of sequential and medium-range
inter-residue interactions by employing full-matrix NOESY
simulations with respect to observed NOESY data (10), and (iii)
conformational sampling by Monte Carlo simulated annealing subject to the
distance constraints derived from NOESY data (11). Two-dimensional
NMR experiments were conducted in 90% H2O, 10% D2O and in 100%
D2O under the following solution conditions: peptide concentration =
1-3.5 mM, pH = 5.5 in phosphate buffer, temperature = 6-25 °C. NMR
experiments were done over a wide range of peptide concentrations to
examine whether there were complications due to inter-molecular
associations. Analyses of the results of total correlation spectroscopy,
double quantum filtered correlation spectroscopy, and NOESY (at two
mixing times) experiments in 90% H2O, 10% D2O led to the sequential
assignment of the spin system (HN, Hα, Hβ) and also to the
identification of secondary structural states of various residues in the V3 loop
(12). Additional NMR data in D2O further confirmed sequential
assignment of all non-exchangeable protons. Prominent secondary structural
elements emerged from the characteristic NOE pattern present in the
two-dimensional NMR data. In addition to the characteristic structural
features (i.e. the presence of a β-strand, a turn, or a helix) a complete set
of structural constraints was derived from two-dimensional NMR data:
φ from J(HN-Hα) coupling and inter-residue HN-HN, Hα-HN, and Hβ-HN
distances from two-dimensional NMR experiments in 90% H2O, 10%
D2O; χ1 from J(Hα-Hβ) coupling and intra-residue Hα-Hβ and HN-Hβ
distances; and inter-residue Hα-Hα and Hβ-Hβ distances from
two-dimensional NMR experiments of the V3 loop in D2O. All these
structural constraints were used for structure determination. The structure
determination consisted of two steps: (i) extraction of inter-proton
distances and (ii) incorporation of these distance constraints for obtaining
a cluster of structures in agreement with the NMR data. Full-matrix
NOESY simulations with respect to experimental data at two mixing
times (150 and 300 ms) enabled us to include both primary and higher
orders of NOEs. Thus, the complications in the distance estimate using
a two-spin model, often encountered at a high mixing time due to
spin-diffusion (i.e. higher order NOEs), are avoided in the full-matrix
NOESY simulations where all spins are considered in the relaxation
(10). Such a simulation at two mixing times improves the rigor in
[Fig. 3 graphic not reproduced: panel A, sequential NOEs; panel B, medium range NOEs.]


Fig. 3. Summary of the NMR data for the HIV-MN V3 loop in water and in water/TFE (7:3) mixed solvent. Nomenclature of various
sequential and medium range NOEs is taken from Wüthrich et al. (12). In addition to sequential Hα-HN, Hβ-HN, and HN-HN NOEs, about 10
sequential Hγ-HN NOEs were obtained in both the solvents. The sequential Hα-HN connectivity for Pro is missing but the sequential NOEs
provide the NOE link. Note the solvent-induced change in the sequential NOE pattern in the C-terminal segment; in the mixed solvent, there is
an enhancement of the sequential HN-HN NOEs relative to the corresponding Hα-HN NOEs, indicative of an induction of a helix.


estimating structural constraints for pairwise inter-proton interactions,
i.e. for each constraint, an upper and a lower limit of the distance. Two
types of constraints are identified (11).

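Type I is given as

\[
\mathrm{EDIST} =
\begin{cases}
0 & \text{if } r_1 \le r \le r_2 \\
k (r - r_1)^2 & \text{if } r < r_1 \\
k (r - r_2)^2 & \text{if } r > r_2
\end{cases}
\]

where r is the inter-proton distance, r1 and r2 are the lower and upper
limits of the specified range, and k is the force constant.

Type II is given as

\[
\mathrm{EDIST} =
\begin{cases}
0 & \text{if } r \ge r_1 \\
k (r - r_1)^2 & \text{if } r < r_1
\end{cases}
\]

This type is particularly useful for an unobserved NOE where we can
set a lowest allowable distance limit for the corresponding proton pair.
The φ-angle constraints are also included as 1-4 distance constraints.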
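
The two constraint types translate directly into penalty functions. The sketch below follows the definitions above; the function names and the default force constant of 1.0 are hypothetical choices made for illustration.

```python
def edist_type1(r, r1, r2, k=1.0):
    """Type I: zero inside [r1, r2], harmonic penalty outside the range."""
    if r < r1:
        return k * (r - r1) ** 2
    if r > r2:
        return k * (r - r2) ** 2
    return 0.0


def edist_type2(r, r1, k=1.0):
    """Type II: penalty only when the pair is closer than the lower limit r1."""
    if r < r1:
        return k * (r - r1) ** 2
    return 0.0


def total_edist(type1_terms, type2_terms, k=1.0):
    """Sum EDIST over (r, r1, r2) tuples and (r, r1) tuples."""
    total = sum(edist_type1(r, lo, hi, k) for r, lo, hi in type1_terms)
    total += sum(edist_type2(r, lo, k) for r, lo in type2_terms)
    return total
```

A flat-bottomed penalty of this kind leaves a structure unpenalized anywhere inside the experimentally allowed range, so the NMR data restrict the sampled conformations without forcing a single distance value.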

The energy term, EDIST, is added to the force-field QCPE 454 due to
Scheraga and co-workers (7). The simulated annealing is performed in
the following manner. First, a starting energy-minimized structure is
chosen and Monte Carlo simulations are performed for 50,000 steps at
1000 K in the (φ, ψ, ω, χ)-space; the last accepted configuration is stored
to be subsequently used as a starting configuration in the next lower
temperature-cycle. Second, 50,000 Monte Carlo steps are repeated in
several cycles of gradually decreasing temperature until a temperature
of 100 K is reached. Third, the lowest energy configuration at 100 K
is further energy-minimized to a low energy gradient. This is the
"temperature quenching" step in which thermally excited single bond
rotations around the equilibrium positions are quenched. Finally,
the first through third steps are repeated for 20 different starting
configurations.
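
The annealing schedule described above can be sketched as a Metropolis Monte Carlo loop over temperature cycles. This is a toy illustration, not the authors' program: the energy function is supplied by the caller, the intermediate temperatures between 1000 K and 100 K are not given in the text and must be chosen, and the energy unit implied by `K_B`, the single-variable move, and the names `anneal` and `toy_energy` are assumptions. The final gradient minimization ("temperature quenching") is omitted.

```python
import math
import random

K_B = 0.0019872  # kcal/(mol K); ASSUMPTION about the energy unit


def anneal(energy, start, temperatures, steps_per_cycle, max_move, rng):
    """Return the lowest-energy configuration found in the last cycle."""
    current = list(start)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for temperature in temperatures:
        # Each cycle starts from the last accepted configuration of the
        # previous, hotter cycle.
        best, e_best = list(current), e_current
        for _ in range(steps_per_cycle):
            trial = list(current)
            index = rng.randrange(len(trial))
            trial[index] += rng.uniform(-max_move, max_move)
            e_trial = energy(trial)
            delta = e_trial - e_current
            # Metropolis criterion: always accept downhill moves, accept
            # uphill moves with Boltzmann probability.
            if delta <= 0.0 or rng.random() < math.exp(-delta / (K_B * temperature)):
                current, e_current = trial, e_trial
                if e_current < e_best:
                    best, e_best = list(current), e_current
    return best, e_best


def toy_energy(angles):
    """Stand-in energy with a single minimum at the origin."""
    return sum(a * a for a in angles)
```

In the procedure of the text each cycle would run for 50,000 steps and the whole schedule would be repeated from 20 different starting configurations; a call such as `anneal(toy_energy, [3.0, -2.0], [1000.0, 700.0, 400.0, 100.0], 2000, 0.5, random.Random(0))` exercises the same logic on the toy function.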

All sampled low energy structures are analyzed to define the extent
of conformational variability. Although the 20 starting structures chosen
for Monte Carlo simulation are conformationally different, they are not
included in the analyses for conformational flexibility, because these
structures, obtained by NMR pattern recognition followed by energy
minimization, do not adequately define the population density of the
energy basins to which they belong. However, after simulated
annealing, energy barriers are crossed and different energy basins are
visited a sufficient number of times. Therefore, after such a sampling,
analyses of conformational variants become physically meaningful.

Solid  Phase  Peptide  Enzyme-linked  Immunosorbent 
Assay  with  Monoclonal  Antibodies 

Peptides (0.5 μg/ml) were bound to Dynatech Immulon IV 96 well
plates (Chantilly, VA) by overnight incubation in 0.05 M bicarbonate
buffer. The remaining protein binding sites were blocked by one hour
of room temperature incubation in 5% Carnation nonfat dry milk in
phosphate-buffered saline at pH 7.4. The plates were then incubated
with 50 μl of the appropriately diluted monoclonal antibodies for 1 h at
room temperature. The plates were then washed three times with
phosphate-buffered saline. This was followed by a 1-h incubation with
50 μl of the secondary antibody consisting of Sigma goat anti-mouse IgG
conjugated to alkaline phosphatase and diluted 1/3000 in 5% Carnation
nonfat dry milk in phosphate-buffered saline at pH 7.4. The plates were
then washed three times with phosphate-buffered saline. Detection was
accomplished with 4 mg/ml phosphatase substrate in 0.25 M
diethanolamine with 68 μM MgCl2·6H2O at pH 9.8. The reaction was terminated
after 1 h by adding 50 μl of 3 N NaOH and the absorbance was read at
405 nm.

RESULTS 

Molecular Modeling— Amino acid sequence analyses of V3
loops from various HIV-1 strains show that variability in amino
acid sequence occurs mainly within specified regions of the V3
loop, leaving three regions that are fairly conserved. Fig. 1A
shows the North American consensus V3 loop sequence and the
variability in amino acid sequence observed at different sites
(13). The relatively conserved regions are: (i) the N-terminal
segment which generally includes a site of glycosylation, (ii) the
GPG crest, and (iii) the C-terminal segment. Amino acid
sequence variability among different V3 loop sequences is
confined mainly to the two regions flanking the GPG crest. Fig. 1B
shows the HIV-MN V3 loop which, although lacking the
glycosylation site at the N terminus, shows a close sequence
resemblance with the North American consensus V3 loop. Previously,
we have reported a method that defines secondary structural
states of relatively conserved and highly variable regions of V3
loop sequences and predicts the energetically stable tertiary
fold(s) of the S-S-bridged V3 loop (8, 9). Using the same method
we analyzed the secondary structural elements and the global
structure of twenty different V3 loops. The set of V3 loop
sequences included the MN isolate, V3 loops from different
geographic locations, and V3 loops from isolates showing
different tropisms. The analyses predicted the following
secondary structural states for the three conserved regions: (i) an
8-residue long loop at the N terminus, (ii) a type II β-turn at the
GPG crest, and (iii) a C-terminal helix. Two-dimensional NMR
spectroscopy revealed that these conserved structural features
were also present in the HIV-MN V3 loop.

NMR Experiments— NMR studies were performed on the
control linear peptide only in the aqueous environment. A
complete sequential assignment was achieved by combining
the total correlation spectroscopy and NOESY data. NMR data
showed that only the central principal neutralizing
determinant sequence adopted a protruding loop with a flexible GPGR
turn and disordered N- and C-terminal segments. The
structural studies were performed on the cyclic HIV-MN V3 loop in
aqueous and in mixed (water/TFE) solvents. Combination of
total correlation spectroscopy and NOESY data in 90% H2O,
10% D2O was used to obtain the sequential assignment. Fig. 2,
A and B, show the NOESY HN-Hα (fingerprint) and HN-HN
regions for 300 ms of mixing time. Note the presence of
continuous HN(i)-Hα(i - 1) sequential connectivity and a number of
sequential HN-HN cross-peaks. However, for structure
determination one requires relative strengths (not the mere
presence) of these cross-peaks. The relative strengths of the
sequential and medium range NOEs were obtained by performing
NOESY experiments in 90% H2O, 10% D2O at two different
mixing times (150 and 300 ms). In addition, 27 φ-angle
constraints were obtained from the double quantum filtered
correlation spectroscopy data of the MN V3 loop in water. Various
sequential and medium range NOEs of the cyclic MN V3 loop in
aqueous solution are summarized in Fig. 3. The full-matrix
NOESY analyses result in 200 pairwise inter-proton distances
corresponding to sequential and medium range interactions.
The 200 distance and 27 dihedral angle constraints indicate the
following effects of the cyclization: (i) induction of an
N-terminal loop containing residues 1-9, (ii) stabilization of the GPGR
turn, and (iii) formation of two turns at residues 23-26 and
30-33 in the C terminus. The presence of these two turns in the
C terminus reveals an incipient helix even in a polar solvent
like water.

NMR experiments are, therefore, conducted in a less polar
water/TFE (7:3) mixed solvent to promote the formation of a
C-terminal helix. Fig. 4, A and B, show the corresponding
NOESY HN-Hα (fingerprint) and HN-HN regions for 300 ms of
mixing. Spectra in Fig. 4, A and B, indicate that the cross-peaks
are better resolved in the mixed solvent although the
one-dimensional signals in the mixed solvent are slightly broader
than in water. NMR data of the HIV-MN V3 loop in the mixed
solvent allow identification of 220 NOESY cross-peaks
including the intra-residue HN-Hα NOEs containing the φ-angle
information. The induction of the C-terminal helix in the mixed
water/TFE solvent is supported by the following NMR data
characteristic of a helix (12): (i) the chemical shift of the Hα
protons belonging to the residues in the 23-33 segment shows
a solvent-induced high field shift, (ii) the sequential HN-HN
cross-peaks for the residues in this segment show an appreciable
solvent-induced increase in intensity relative to the
corresponding sequential Hα-HN intensities, and (iii) the
Hα(i)-HN(i + 3/4) and Hβ-Hα(i + 3/4) NOESY cross-peaks emerge for
the C-terminal residues. Various sequential and medium range
NOEs of the cyclic MN V3 loop in the mixed solvent are
summarized in Fig. 3.

Fig. 4. NOESY cross-sections of the cyclic MN V3 loop in water/
TFE (7:3) mixture (peptide concentration = 2.5 mM; pH 4.5;
temperature = 10 °C). The HDO signal was pre-saturated for 1 s during
the relaxation delay. Acquisition parameters: data matrix (t2 = 2K,
t1 = 1K), relaxation delay = 1.5 s, number of transients = 32, temperature
= 10 °C. A, the fingerprint HN-Hα region; B, the HN-HN region.

Solvent-induced Structural Changes— A Monte Carlo
simulated annealing procedure (11) subject to the distance and the
torsion angle constraints derived from the NMR data leads to a
cluster of structures for the MN V3 loop in water and in a mixed
water/TFE solvent. Fig. 5, A and B, show ribbon diagrams of
two different folding patterns of the MN V3 loop in water and
in a mixed water/TFE solvent, respectively. In each case, the
average folding patterns are shown and averages are computed
over 70 low energy structures in agreement with the NMR
data. Three conserved segments of the V3 loop are color coded:
yellow for the N-terminal segment, red for the GPGR crest, and
cyan for the C-terminal segment. Note that in both folding motifs
the local secondary structures of the N-terminal segment and the
GPGR crest remain the same; however, the induction of the
C-terminal helix in the mixed solvent changes the spatial
interrelations of the three secondary structural elements in the two
structures. In addition, a short but well defined β-strand
conformation present in aqueous solution disappears in the mixed
water/TFE solvent. Therefore, solvent induced changes are also
detected in the structure and presentation of the neutralizing
epitope (comprising the central GPGR and 3-4 flanking amino
acids on either side) of the MN V3 loop.

Fig. 5. A and B, the ribbon diagrams showing the average folding patterns of the structures of the MN V3 loop in water and in mixed water/TFE
solvent. In each case, the average is done over 70 sampled low energy structures. Note that, in each case, the neutralizing epitope containing the
central GPGR sequence forms a protruding loop even though the local structure and presentation of the loop in the two cases are noticeably different.
All the sampled structures in A and B showed rms deviations of 0.25 ± 0.01 with respect to 140 specified distance constraints. The structures that
satisfy the NMR constraints of the V3-MN loop in water show a greater degree of flexibility than those in agreement with NMR data in the mixed
water/TFE solvent; this is due to the formation of the C-terminal helix in the mixed solvent.

The flexibility of the V3 loop in the two solvents is markedly
different. The nature of flexibility of the two structures is
identified by examining the standard deviations (S.D.) of the
backbone and side chain torsion angles in these two structures
around their average values. These deviations (observed by
analyzing energy minimized structures) reflect the lowest
possible values because the thermal motions (particularly for the
side chains) are filtered off by energy minimization. Tables I
and II show the S.D. values in the torsion angles for the V3 loop
structures in the aqueous and mixed solvents, respectively.
Note that the structure in the aqueous solvent is more flexible
than the structure in the mixed solvent. The flexibility of the
residues  (Lys^^,  Arg^^,  His^^,  Ile^^)  in  the  aqueous  V3  loop 
structure is greatly reduced in the mixed solvent structure
because of the induction of a C-terminal helix involving
residues 23-33. The helix for residues 23-33 of the V3 loop in the
mixed solvent is a distorted one, i.e. the continuous stretch of
C=O(i)···HN(i + 4) H-bonds is weakened where (φ, ψ) values

deviate  from  the  ideal  helix  values  (for  examples,  residues  Ile^® 
and  Gln^^  in  Tables  I  and  II).  Also  note  that  the  residues  20-23 
and  28-34  on  the  C-terminal  side  of  the  V3  loop  in  the  aqueous 
solvent  are  more  flexible  than  the  corresponding  residues  of  the 
V3  loop  in  the  mixed  solvent. 
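
Averages and standard deviations of torsion angles need some care because angles are periodic: values of 170° and -170° are only 20° apart. The sketch below shows one common way to handle this with circular statistics. It is an assumption that a periodic treatment is appropriate here; the text does not say how the S.D. values in Tables I and II were computed, and the function name `circular_mean_sd` is hypothetical.

```python
import math


def circular_mean_sd(angles_deg):
    """Circular mean and circular standard deviation, both in degrees."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    mean = math.degrees(math.atan2(s, c))
    # Mean resultant length: 1 for identical angles, near 0 for a uniform
    # spread. Clamp to 1 to guard against rounding error.
    resultant = min(1.0, math.hypot(s, c))
    if resultant == 0.0:
        return mean, float("inf")
    sd = math.degrees(math.sqrt(-2.0 * math.log(resultant)))
    return mean, sd
```

A single mean and S.D. summarize a unimodal distribution only; for the torsion angles flagged as bimodal in the tables, each population would need to be treated separately.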

The biological relevance of TFE-induced structural change is
often questioned. However, it may be pointed out that water
molecules are largely excluded from the surface of the V3 loop
in its active form when it is interacting with antibodies or the
host-cell receptor or with other domains of gp120 (14).
Therefore, TFE-induced structural changes may shed some light on
the process that accompanies the activation of the V3 loop. In
addition, the physico-chemical observation that the C-terminal
residues of the V3 loop adopt a helical structure in the mixed
solvent is a testimony that the same residues have an intrinsic
helix forming propensity which is masked in water due to
competing water-peptide H-bonds.

Monoclonal Antibody Binding Data for the Linear and Cyclic
V3 Loops— The induction of structure due to the S-S bridge
between C1 and C35 has a strong bearing on the antibody
binding properties of the HIV V3 loop. The lack of a well
defined structure in the linear peptide is also evident from
antibody binding studies. Binding of linear and cyclic V3-MN
loops to three different monoclonal antibodies is compared in
Fig. 6. Antibodies 1510, 1511, and 1289 bind to the V3 epitopes
KRIHI, HIGPGR, and GPGRAF, respectively (15). Note that
the cyclic V3-MN loop is a better ligand than the linear analog
in all three cases. This is consistent with the NMR evidence
that the cyclic V3-MN loop is more structured than the linear
analog. As expected, the most pronounced difference in binding
occurs for the mAb 1510, which recognizes the sequence KRIHI
on the N-terminal side of the GPG crest; this sequence also
shows more ordered structure upon cyclization. For the other
two antibodies, the difference in binding is smaller, because
both of their epitopes include the GPGR, which even in the linear
analog shows a residual turn. Therefore, the NMR and antibody
binding studies imply that vaccine attempts using the cyclic V3 loop
would be more effective than the linear analog in inducing
protective humoral immunity to the conserved structural
features. The binding profile of human monoclonal antibodies
1510 and 1511 (both derived from HIV infected patients)
reinforces the notion that the cyclic V3 loop presents epitope
structures similar to those found in native gp120. Interestingly,
the Lys10-Arg11-Ile12-His13-Ile14-Gly15-Pro16-Gly17 fragment,
which is a part of the neutralizing epitope of the cyclic MN V3
loop, shows the same structure in water and in the mixed
solvent as in the co-crystal of the neutralizing
antibody and the MN V3 loop peptide antigen (16).

DISCUSSION 

The following conclusions can be drawn based on the data of
the HIV-MN V3 loop presented in this article. (i) The S-S
bridge between C1 and C35 introduces structure in the V3 loop.
(ii) The overall tertiary folding of the V3 loop as well as the local
structure at the PND are critical in deciding the affinity of the
V3 loop for antibody binding. (iii) The structure of the HIV V3
loop is intrinsically flexible and structural transitions of the
loop are possible due to a subtle change in the environment (for
example, the effect of TFE) (17). In the total correlation
spectroscopy and NOESY spectra of the MN V3 loop in water/TFE,
we observed a broadening of the NMR line due to the mixed
solvent. However, unlike the suggestion made in a previous
work (18), the broadening was not big enough to hinder
complete sequential assignments and NOE determination. We
assume this to be due to the fact that we used a low peptide
concentration of 2.5 mM in the mixed solvent experiments,
which prevented the formation of aggregates. (iv) The amino
acid sequence variability of the V3 loop is restricted to the two
sides of the GPG crest (19). Amino acid sequence variability in
the regions flanking the conserved GPGR turn can alter the
stability of the turn and/or alter (camouflage) the surface
accessibility of this conserved secondary structural element.
Therefore, structural studies such as ours on this type of
sequence variability will be useful in PND-based vaccine
design and also in deciphering the role of the V3 loop in cell
fusion (20, 21) and cell tropism (22, 23), where structure and
presentation of the V3 loop might be crucially important.

Table I

The average values (in degrees) of the backbone and side chain torsion angles and their standard deviations of the sampled structures in
agreement with the NMR data of the MN V3 loop in water
The backbone torsion angles (marked *) show bimodal distributions.

Table II

The average values (in degrees) of the backbone and side chain torsion angles and their standard deviations of the sampled structures in
agreement with the NMR data of the MN V3 loop in a mixed water/TFE (7:3) solvent
The backbone torsion angles (marked *) show bimodal distributions.

Fig. 6. An enzyme-linked immunosorbent assay showing the
preference of monoclonal antibodies for the cyclic over the linear form of the
HIV-MN V3 loop. Human monoclonal antibodies 1510 (top) and 1511 (center),
and mouse antibody 1289 (bottom) all bind to a greater extent to the cyclic V3
loop peptide. The recognized epitopes of 1510 (top), 1511 (center), and 1289
(bottom) are shown on the right. In the schematic representations of the
HIV-MN V3 loop shown on the right, solid circles depict hydrophobic residues,
open circles charged residues, and outlined circles polar uncharged residues.